Abstract

We survey the main results of approximation theory for adaptive piecewise polynomial functions. In such 
methods, the partition on which the piecewise polynomial approximation is defined is not fixed in advance, 
but adapted to the given function $f$ which is approximated. We focus our discussion on (i) the properties that
describe an optimal partition for $f$, (ii) the smoothness properties of $f$ that govern the rate of convergence of the
approximation in the $L^p$-norms, and (iii) fast refinement algorithms that generate near optimal partitions. While
these results constitute a fairly established theory in the univariate case and in the multivariate case when dealing 
with elements of isotropic shape, the approximation theory for adaptive and anisotropic elements is still building 
up. We put a particular emphasis on some recent results obtained in this direction. 



1 Introduction 

1.1 Piecewise polynomial approximation 

Approximation by piecewise polynomial functions is a procedure that occurs in numerous applications. In some
of them such as terrain data simplification or image compression, the function $f$ to be approximated might be fully
known, while it might be only partially known or fully unknown in other applications such as denoising, statistical
learning or in the finite element discretization of PDEs. In all these applications, one usually makes the distinction
between uniform and adaptive approximation. In the uniform case, the domain of interest is decomposed into a
partition where all elements have comparable shape and size, while these attributes are allowed to vary strongly
in the adaptive case. The partition may therefore be adapted to the local properties of $f$, with the objective of
optimizing the trade-off between accuracy and complexity of the approximation. This chapter is concerned with
the following fundamental questions:

• Which mathematical properties describe an optimally adapted partition for a given function $f$?

• For such optimally adapted partitions, what smoothness properties of $f$ govern the convergence properties
of the corresponding piecewise polynomial approximations?

• Can one construct optimally adapted partitions for a given function $f$ by a fast algorithm?

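For a given bounded domain $\Omega \subset \mathbb{R}^d$ and a fixed integer $m > 0$, we associate to any partition $\mathcal{T}$ of $\Omega$ the space

$$ V_{\mathcal{T}} := \{ f \ \text{s.t.}\ f_{|T} \in \mathbb{P}_{m-1},\ T \in \mathcal{T} \} $$

of piecewise polynomial functions of total degree $m-1$ over $\mathcal{T}$. The dimension of this space measures the
complexity of a function $g \in V_{\mathcal{T}}$. It is proportional to the cardinality of the partition:

$$ \dim(V_{\mathcal{T}}) := C_{m,d}\,\#(\mathcal{T}), \quad \text{with} \quad C_{m,d} := \dim(\mathbb{P}_{m-1}) = \binom{m+d-1}{d}. $$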
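As a quick sanity check on this count, the following minimal Python sketch evaluates $C_{m,d} = \binom{m+d-1}{d}$ and the resulting dimension of $V_{\mathcal{T}}$. The function names `poly_space_dim` and `piecewise_dim` are illustrative choices and do not come from the chapter.

```python
from math import comb

def poly_space_dim(m: int, d: int) -> int:
    """Dimension of the polynomials of total degree <= m - 1 in d variables."""
    return comb(m + d - 1, d)

def piecewise_dim(m: int, d: int, n_elements: int) -> int:
    """Dimension of the discontinuous piecewise polynomial space on n_elements cells."""
    return poly_space_dim(m, d) * n_elements

# Piecewise constants (m = 1) carry one degree of freedom per element,
# piecewise linears in two variables (m = 2, d = 2) carry three.
print(poly_space_dim(1, 2), poly_space_dim(2, 2), piecewise_dim(3, 2, 10))
```

Since no continuity is imposed between elements, the degrees of freedom simply add up over the partition.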



In order to describe how accurately a given function $f$ may be described by piecewise polynomial functions of a
prescribed complexity, it is therefore natural to introduce the error of best approximation in a given norm $\|\cdot\|_X$
which is defined as

$$ \sigma_N(f)_X := \inf_{\#(\mathcal{T}) \le N}\ \min_{g \in V_{\mathcal{T}}} \|f - g\|_X. $$

This object of study is too vague if we do not make some basic assumptions that limit the set of partitions
which may be considered. We therefore restrict the definition of the above infimum to a class $\mathcal{A}_N$ of "admissible
partitions" of complexity at most $N$. The approximation to $f$ is therefore searched in the set

$$ \Sigma_N := \cup_{\mathcal{T} \in \mathcal{A}_N} V_{\mathcal{T}}, $$

and the error of best approximation is now defined as

$$ \sigma_N(f)_X := \inf_{g \in \Sigma_N} \|f - g\|_X = \inf_{\mathcal{T} \in \mathcal{A}_N}\ \inf_{g \in V_{\mathcal{T}}} \|f - g\|_X. $$

The assumptions which define the class $\mathcal{A}_N$ are usually of the following type:

1. The elementary geometry of the elements of $\mathcal{T}$. The typical examples that are considered in this chapter
are: intervals when $d = 1$, triangles or rectangles when $d = 2$, simplices when $d > 2$.

2. Restrictions on the regularity of the partition, in the sense of the relative size and shape of the elements that
constitute the partition.

3. Restrictions on the conformity of the partition, which impose that each face of an element $T$ is common to
at most one adjacent element $T'$.

The conformity restriction is critical when imposing global continuity or higher smoothness properties in the
definition of $V_{\mathcal{T}}$, and if one wants to measure the error in some smooth norm. In this survey, we limit our
interest to the approximation error measured in $X = L^p$. We therefore do not impose any global smoothness
property on the space $V_{\mathcal{T}}$ and ignore the conformity requirement.

Throughout this chapter, we use the notation

$$ e_{m,\mathcal{T}}(f)_p := \min_{g \in V_{\mathcal{T}}} \|f - g\|_{L^p}, $$

to denote the $L^p$ approximation error in the space $V_{\mathcal{T}}$ and

$$ \sigma_N(f)_p := \sigma_N(f)_{L^p} = \inf_{g \in \Sigma_N} \|f - g\|_{L^p} = \inf_{\mathcal{T} \in \mathcal{A}_N} e_{m,\mathcal{T}}(f)_p. $$

If $T \in \mathcal{T}$ is an element and $f$ is a function defined on $\Omega$, we denote by

$$ e_{m,T}(f)_p := \min_{\pi \in \mathbb{P}_{m-1}} \|f - \pi\|_{L^p(T)}, $$

the local approximation error. We thus have

$$ e_{m,\mathcal{T}}(f)_p = \Big( \sum_{T \in \mathcal{T}} e_{m,T}(f)_p^p \Big)^{1/p}, $$

when $p < \infty$ and

$$ e_{m,\mathcal{T}}(f)_\infty = \max_{T \in \mathcal{T}} e_{m,T}(f)_\infty. $$

The norm $\|f\|_{L^p}$ without precision on the domain stands for $\|f\|_{L^p(\Omega)}$ where $\Omega$ is the full domain where $f$ is
defined.

1.2 From uniform to adaptive approximation 

Concerning the restrictions on the regularity of the partitions, three situations should be distinguished:

1. Quasi-uniform partitions: all elements have approximately the same size. This may be expressed by a
restriction of the type

$$ C_1 N^{-1/d} \le \rho_T \le h_T \le C_2 N^{-1/d}, \tag{1.1} $$

for all $T \in \mathcal{T}$ with $\mathcal{T} \in \mathcal{A}_N$, where $0 < C_1 \le C_2$ are constants independent of $N$, and where $h_T$ and $\rho_T$
respectively denote the diameters of $T$ and of its largest inscribed disc.

2. Adaptive isotropic partitions: elements may have arbitrarily different size but their aspect ratio is controlled
by a restriction of the type

$$ \frac{h_T}{\rho_T} \le C, \tag{1.2} $$

for all $T \in \mathcal{T}$ with $\mathcal{T} \in \mathcal{A}_N$, where $C > 1$ is independent of $N$.

3. Adaptive anisotropic partitions: elements may have arbitrarily different size and aspect ratio, i.e. no
restriction is made on $h_T$ and $\rho_T$.

A classical result states that if a function $f$ belongs to the Sobolev space $W^{m,p}(\Omega)$, the $L^p$ error of approximation
by piecewise polynomials of degree $m-1$ on a given partition satisfies the estimate

$$ e_{m,\mathcal{T}}(f)_p \le C h^m |f|_{W^{m,p}}, \tag{1.3} $$

where $h := \max_{T \in \mathcal{T}} h_T$ is the maximal mesh-size, $|f|_{W^{m,p}} := \big( \sum_{|\alpha| = m} \|\partial^\alpha f\|_{L^p}^p \big)^{1/p}$ is the standard Sobolev
semi-norm, and $C$ is a constant that only depends on $(m,d,p)$. In the case of quasi-uniform partitions, this yields
an estimate in terms of complexity:

$$ \sigma_N(f)_p \le C N^{-m/d} |f|_{W^{m,p}}, \tag{1.4} $$

where the constant $C$ now also depends on $C_1$ and $C_2$ in (1.1).

Here and throughout the chapter, $C$ denotes a generic constant which may vary from one equation to the other.
The dependence of this constant with respect to the relevant parameters will be mentioned when necessary.

Note that the above estimate can be achieved by restricting the family $\mathcal{A}_N$ to a single partition: for example,
we start from a coarse partition $\mathcal{D}_0$ into cubes and recursively define a nested sequence of partitions $\mathcal{D}_j$ by
splitting each cube of $\mathcal{D}_{j-1}$ into $2^d$ cubes of half side-length. We then set

$$ \mathcal{A}_N := \{\mathcal{D}_j\}, \quad \text{if} \quad \#(\mathcal{D}_0)\, 2^{dj} \le N < \#(\mathcal{D}_0)\, 2^{d(j+1)}. $$

Similar uniform refinement rules can be proposed for more general partitions into triangles, simplices or
rectangles. With such a choice for $\mathcal{A}_N$, the set $\Sigma_N$ on which one picks the approximation is thus a standard linear
space. Piecewise polynomials on quasi-uniform partitions may therefore be considered as an instance of linear
approximation.

The interest of adaptive partitions is that the choice of $\mathcal{T} \in \mathcal{A}_N$ may vary depending on $f$, so that the set $\Sigma_N$ is
inherently a nonlinear space. Piecewise polynomials on adaptive partitions are therefore an instance of nonlinear
approximation. Other instances include approximation by rational functions, or by $N$-term linear combinations
of a basis or dictionary. We refer to [28] for a general survey on nonlinear approximation.

The use of adaptive partitions allows one to improve significantly on (1.4). The theory that describes these
improvements is rather well established for adaptive isotropic partitions: as explained further, a typical result for
such partitions is of the form

$$ \sigma_N(f)_p \le C N^{-m/d} |f|_{W^{m,\tau}}, \tag{1.5} $$

where $\tau$ can be chosen smaller than $p$. Such an estimate reveals that the same rate of decay $N^{-m/d}$ as in (1.4)
is achieved for $f$ in a smoothness space which is larger than $W^{m,p}$. It also says that for a smooth function, the
multiplicative constant governing this rate might be substantially smaller than when working with quasi-uniform
partitions.

When allowing adaptive anisotropic partitions, one should expect further improvements. From an intuitive
point of view, such partitions are needed when the function $f$ itself displays locally anisotropic features such as
jump discontinuities or sharp transitions along smooth manifolds. The available approximation theory for such
partitions is still in its infancy. Here, typical estimates are also of the form

$$ \sigma_N(f)_p \le C N^{-m/d} A(f), \tag{1.6} $$

but they involve quantities $A(f)$ which are not norms or semi-norms associated with standard smoothness spaces.
These quantities are highly nonlinear in $f$ in the sense that they do not satisfy $A(f + g) \le C(A(f) + A(g))$ even
with $C \ge 1$.



1.3 Outline 

This chapter is organized as follows. As a starter, we study in §2 the simple case of piecewise constant
approximation on an interval. This example gives a first illustration of the difference between the approximation properties
of uniform and adaptive partitions. It also illustrates the principle of error equidistribution which plays a crucial
role in the construction of adaptive partitions which are optimally adapted to $f$. This leads us to propose and
study a multiresolution greedy refinement algorithm as a design tool for such partitions. The distinction between
isotropic and anisotropic partitions is irrelevant in this case, since we work with one-dimensional intervals.

We discuss in §3 the derivation of estimates of the form (1.5) for adaptive isotropic partitions. The main
guiding principle for the design of the partition is again error equidistribution. Adaptive greedy refinement algorithms
are discussed, similar to the one-dimensional case.

We study in §4 an elementary case of adaptive anisotropic partitions for which all elements are two-dimensional
rectangles with sides that are parallel to the $x$ and $y$ axes. This type of anisotropic partition suffers from an
intrinsic lack of directional selectivity. We limit our attention to piecewise constant functions, and identify the
quantity $A(f)$ involved in (1.6) for this particular case. The main guiding principles for the design of the optimal
partition are now error equidistribution combined with a local shape optimization of each element.

In §5, we present some recently available theory for piecewise polynomials on adaptive anisotropic partitions
into triangles (and simplices in dimension $d > 2$) which offer more directional selectivity than the previous
example. We give a general formula for the quantity $A(f)$ which can be turned into an explicit expression in terms
of the derivatives of $f$ in certain cases such as piecewise linear functions, i.e. $m = 2$. Due to the fact that $A(f)$ is
not a semi-norm, the function classes defined by the finiteness of $A(f)$ are not standard smoothness spaces. We
show that these classes include piecewise smooth objects separated by discontinuities or sharp transitions along
smooth edges.

We present in §6 several greedy refinement algorithms which may be used to derive anisotropic partitions.
The convergence analysis of these algorithms is more delicate than for their isotropic counterpart, yet some first
results indicate that they tend to generate optimally adapted partitions which satisfy convergence estimates in
accordance with (1.6). This behaviour is illustrated by numerical tests on two-dimensional functions.



2 Piecewise constant one-dimensional approximation 

We consider here the very simple problem of approximating a continuous function by piecewise constants on the
unit interval $[0,1]$, when we measure the error in the uniform norm. If $f \in C([0,1])$ and $I \subset [0,1]$ is an arbitrary
interval we have

$$ e_{1,I}(f)_\infty := \min_{c \in \mathbb{R}} \|f - c\|_{L^\infty(I)} = \frac{1}{2} \max_{x,y \in I} |f(x) - f(y)|. $$

The constant $c$ that achieves the minimum is the midrange $\frac{1}{2}(\max_I f + \min_I f)$ of $f$ on $I$. Remark that we multiply this estimate at most
by a factor 2 if we take $c = f(z)$ for any $z \in I$. In particular, we may choose for $c$ the average of $f$ on $I$ which is
still defined when $f$ is not continuous but simply integrable.

If $\mathcal{T}_N = \{I_1, \dots, I_N\}$ is a partition of $[0,1]$ into $N$ sub-intervals and $V_{\mathcal{T}_N}$ the corresponding space of piecewise
constant functions, we thus find that

$$ e_{1,\mathcal{T}_N}(f)_\infty := \min_{g \in V_{\mathcal{T}_N}} \|f - g\|_{L^\infty} = \frac{1}{2} \max_{k=1,\dots,N}\ \max_{x,y \in I_k} |f(x) - f(y)|. \tag{2.7} $$
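The two formulas above translate directly into a few lines of code. The sketch below is only an illustration: it approximates the oscillation of $f$ on each interval by sampling on a fine grid (an assumption that is adequate for smooth or monotone functions), and the helper names `local_error` and `global_error` are hypothetical.

```python
def local_error(f, a, b, samples=257):
    """Approximate e_{1,I}(f)_inf = (max f - min f) / 2 on I = [a, b] by sampling."""
    values = [f(a + (b - a) * i / (samples - 1)) for i in range(samples)]
    return 0.5 * (max(values) - min(values))

def global_error(f, nodes, samples=257):
    """Sup-norm error of the best piecewise constant on the partition defined by nodes."""
    return max(local_error(f, a, b, samples) for a, b in zip(nodes, nodes[1:]))

# Partition of [0, 1] into three intervals, applied to f(x) = x^2.
nodes = [0.0, 0.25, 0.5, 1.0]
err = global_error(lambda x: x * x, nodes)
print(err)
```

For this monotone example the largest oscillation is reached on $[0.5, 1]$, where it equals $1 - 0.25$, so the global error is half of that.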



2.1 Uniform partitions 

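We first study the error of approximation when the $\mathcal{T}_N$ are uniform partitions consisting of the intervals
$I_k = [\frac{k-1}{N}, \frac{k}{N}]$, $k = 1, \dots, N$. Assume first that $f$ is a Lipschitz function, i.e. $f' \in L^\infty$. We then have

$$ \max_{x,y \in I_k} |f(x) - f(y)| \le |I_k|\, \|f'\|_{L^\infty(I_k)} = N^{-1} \|f'\|_{L^\infty(I_k)}. $$

Combining this estimate with (2.7), we find that for uniform partitions,

$$ f \in \mathrm{Lip}([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-1}, \tag{2.8} $$

with $C = \frac{1}{2} \|f'\|_{L^\infty}$. For less smooth functions, we may obtain lower convergence rates: if $f$ is Hölder continuous
of exponent $0 < \alpha < 1$, we have by definition

$$ |f(x) - f(y)| \le |f|_{C^\alpha} |x - y|^\alpha, $$

which yields

$$ \max_{x,y \in I_k} |f(x) - f(y)| \le N^{-\alpha} |f|_{C^\alpha}. $$

We thus find that

$$ f \in C^\alpha([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-\alpha}, \tag{2.9} $$

with $C = \frac{1}{2} |f|_{C^\alpha}$.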

The estimates (2.8) and (2.9) are sharp in the sense that they admit a converse: it is easily checked that if $f$
is a continuous function such that $\sigma_N(f)_\infty \le C N^{-1}$ for some $C > 0$, it is necessarily Lipschitz. Indeed, for any $x$
and $y$ in $[0,1]$, consider an integer $N$ such that $\frac{1}{2} N^{-1} \le |x - y| \le N^{-1}$. For such an integer, there exists a $f_N \in V_{\mathcal{T}_N}$
such that $\|f - f_N\|_{L^\infty} \le C N^{-1}$. We thus have

$$ |f(x) - f(y)| \le 2 C N^{-1} + |f_N(x) - f_N(y)|. $$

Since $x$ and $y$ are either contained in one interval or two adjacent intervals of the partition $\mathcal{T}_N$ and since $f$ is
continuous, we find that $|f_N(x) - f_N(y)|$ is either zero or less than $2 C N^{-1}$. We therefore have

$$ |f(x) - f(y)| \le 4 C N^{-1} \le 8 C |x - y|, $$

which shows that $f \in \mathrm{Lip}([0,1])$. In summary, we have the following result.

Theorem 2.1 If $f$ is a continuous function defined on $[0,1]$ and if $\sigma_N(f)_\infty$ denotes the $L^\infty$ error of piecewise
constant approximation on uniform partitions, we have

$$ f \in \mathrm{Lip}([0,1]) \Leftrightarrow \sigma_N(f)_\infty \le C N^{-1}. \tag{2.10} $$

In an exactly similar way, it can be proved that

$$ f \in C^\alpha([0,1]) \Leftrightarrow \sigma_N(f)_\infty \le C N^{-\alpha}. \tag{2.11} $$

These equivalences reveal that Lipschitz and Hölder smoothness are the properties that govern the rate of
approximation by piecewise constant functions in the uniform norm.

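The estimate (2.8) is also optimal in the sense that it describes the saturation rate of piecewise constant
approximation: a higher convergence rate cannot be obtained, even for smoother functions, and the constant
$C = \frac{1}{2} \|f'\|_{L^\infty}$ cannot be improved. In order to see this, consider an arbitrary function $f \in C^1([0,1])$, so that for all
$\varepsilon > 0$, there exists $\eta > 0$ such that

$$ |x - y| \le \eta \Rightarrow |f'(x) - f'(y)| \le \varepsilon. $$

Therefore if $N$ is such that $N^{-1} \le \eta$, we can introduce on each interval $I_k$ an affine function
$p_k(x) = f(x_k) + (x - x_k) f'(x_k)$ where $x_k$ is an arbitrary point in $I_k$, and we then have

$$ \|f - p_k\|_{L^\infty(I_k)} \le N^{-1} \varepsilon. $$

Since $|(f - p_k)'| \le \varepsilon$ on $I_k$, the oscillation of $f - p_k$ on $I_k$ is at most $N^{-1}\varepsilon$, so that $e_{1,I_k}(f - p_k)_\infty \le \frac{1}{2} N^{-1} \varepsilon$. It follows that

$$ e_{1,I_k}(f)_\infty \ge e_{1,I_k}(p_k)_\infty - e_{1,I_k}(f - p_k)_\infty \ge e_{1,I_k}(p_k)_\infty - \tfrac{1}{2} N^{-1} \varepsilon = \tfrac{1}{2} N^{-1} \big( |f'(x_k)| - \varepsilon \big), $$

where we have used the triangle inequality

$$ e_{m,T}(f + g)_p \le e_{m,T}(f)_p + e_{m,T}(g)_p. \tag{2.12} $$

Choosing for $x_k$ the point that maximizes $|f'|$ on $I_k$ and taking the supremum of the above estimate over all $k$, we
obtain

$$ e_{1,\mathcal{T}_N}(f)_\infty \ge \tfrac{1}{2} N^{-1} \big( \|f'\|_{L^\infty} - \varepsilon \big). $$

Since $\varepsilon > 0$ is arbitrary, this implies the lower estimate

$$ \liminf_{N \to +\infty} N \sigma_N(f)_\infty \ge \tfrac{1}{2} \|f'\|_{L^\infty}. \tag{2.13} $$

Combining with the upper estimate (2.8), we thus obtain the equality

$$ \lim_{N \to +\infty} N \sigma_N(f)_\infty = \tfrac{1}{2} \|f'\|_{L^\infty}, \tag{2.14} $$

for any function $f \in C^1$. This identity shows that for smooth enough functions, the numerical quantity that
governs the rate of convergence $N^{-1}$ of uniform piecewise constant approximations is exactly $\frac{1}{2} \|f'\|_{L^\infty}$.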
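The limit (2.14) is easy to observe numerically. The following sketch, given only as an illustration, uses $f = \exp$ on $[0,1]$, for which $\|f'\|_{L^\infty} = e$; the function name `uniform_error` and the sampling-based evaluation of the oscillation are assumptions of this example.

```python
import math

def uniform_error(f, n, samples=33):
    """Sup-norm error of the best piecewise constant on n equal intervals (sampled)."""
    worst = 0.0
    for k in range(n):
        a, b = k / n, (k + 1) / n
        values = [f(a + (b - a) * i / (samples - 1)) for i in range(samples)]
        worst = max(worst, 0.5 * (max(values) - min(values)))
    return worst

f = math.exp  # f' = exp, hence ||f'||_inf = e on [0, 1]
for n in (10, 100, 1000):
    # The second column should approach the third one as n grows.
    print(n, n * uniform_error(f, n), 0.5 * math.e)
```

The product $N e_{1,\mathcal{T}_N}(f)_\infty$ stays below $\frac{1}{2}\|f'\|_{L^\infty}$, in agreement with (2.8), and approaches it from below as $N$ increases.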

2.2 Adaptive partitions 

We now consider an adaptive partition $\mathcal{T}_N$ for which the intervals $I_k$ may depend on $f$. In order to understand
the gain in comparison to uniform partitions, let us consider a function $f$ such that $f' \in L^1$, i.e. $f \in W^{1,1}([0,1])$.
Remarking that

$$ \max_{x,y \in I} |f(x) - f(y)| \le \int_I |f'(t)|\,dt, $$

we see that a natural choice for the $I_k$ can be done by imposing that

$$ \int_{I_k} |f'(t)|\,dt = N^{-1} \int_0^1 |f'(t)|\,dt, $$

which means that the $L^1$ norm of $f'$ is equidistributed over all intervals. Combining this estimate with (2.7), we
find that for adaptive partitions,

$$ f \in W^{1,1}([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-1}, \tag{2.15} $$

with $C := \frac{1}{2} \|f'\|_{L^1}$. This improvement upon uniform partitions in terms of approximation properties was first
established in [35]. The above argument may be extended to the case where $f$ belongs to the slightly larger space
$BV([0,1])$ which may include discontinuous functions in contrast to $W^{1,1}([0,1])$, by asking that the $I_k$ are such
that

$$ |f|_{BV(I_k)} \le N^{-1} |f|_{BV}. $$

We thus have

$$ f \in BV([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-1}. \tag{2.16} $$

Similar to the case of uniform partitions, the estimate (2.16) is sharp in the sense that a converse result holds: if
$f$ is a continuous function such that $\sigma_N(f)_\infty \le C N^{-1}$ for some $C > 0$, then it is necessarily in $BV([0,1])$. To see
this, consider $N > 0$ and any set of points $0 \le x_1 < x_2 < \dots < x_N \le 1$. We know that there exists a partition $\mathcal{T}_N$
of $N$ intervals and $f_N \in V_{\mathcal{T}_N}$ such that $\|f - f_N\|_{L^\infty} \le C N^{-1}$. We define a set of points $0 \le y_1 < y_2 < \dots < y_M \le 1$
by taking the union of the set of the $x_k$ with the nodes that define the partition $\mathcal{T}_N$, excluding 0 and 1, so that $M < 2N$. We
can write

$$ \sum_{k=1}^{N-1} |f(x_{k+1}) - f(x_k)| \le 2C + \sum_{k=1}^{N-1} |f_N(x_{k+1}) - f_N(x_k)| \le 2C + \sum_{k=1}^{M-1} |f_N(y_{k+1}) - f_N(y_k)|. $$

Since $y_k$ and $y_{k+1}$ are either contained in one interval or two adjacent intervals of the partition $\mathcal{T}_N$ and since $f$ is
continuous, we find that $|f_N(y_{k+1}) - f_N(y_k)|$ is either zero or less than $2 C N^{-1}$, from which it follows that

$$ \sum_{k=1}^{N-1} |f(x_{k+1}) - f(x_k)| \le 6C, $$

which shows that $f$ has bounded variation. We have thus proved the following result.

Theorem 2.2 If $f$ is a continuous function defined on $[0,1]$ and if $\sigma_N(f)_\infty$ denotes the $L^\infty$ error of piecewise
constant approximation on adaptive partitions, we have

$$ f \in BV([0,1]) \Leftrightarrow \sigma_N(f)_\infty \le C N^{-1}. \tag{2.17} $$



In comparison with (2.8) we thus find that the same rate $N^{-1}$ is governed by a weaker smoothness condition since $f'$
is not assumed to be bounded but only a finite measure. In turn, adaptive partitions may significantly outperform
uniform partitions for a given function $f$: consider for instance the function $f(x) = x^\alpha$ for some $0 < \alpha < 1$.
According to (2.11), the convergence rate of uniform approximation for this function is $N^{-\alpha}$. On the other hand,
since $f'(x) = \alpha x^{\alpha - 1}$ is integrable, we find that the convergence rate of adaptive approximation is $N^{-1}$.
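This comparison can be reproduced with a short computation. In the sketch below, which is only illustrative, we take $\alpha = 1/2$; since $f(x) = x^\alpha$ is increasing, the local error on an interval is half the increment of $f$, and the equidistributed nodes are available in closed form. The helper names are hypothetical.

```python
def sup_error(f, nodes):
    """Error of the best piecewise constant for a monotone f: half the largest jump."""
    return max(0.5 * abs(f(b) - f(a)) for a, b in zip(nodes, nodes[1:]))

alpha = 0.5
f = lambda x: x ** alpha

def uniform_nodes(n):
    return [k / n for k in range(n + 1)]

def equidistributed_nodes(n):
    # For f(x) = x**alpha the integral of |f'| over [0, x] equals x**alpha,
    # so equal shares of the L^1 norm of f' are obtained at x_k = (k / n)**(1 / alpha).
    return [(k / n) ** (1.0 / alpha) for k in range(n + 1)]

for n in (16, 64, 256):
    print(n, sup_error(f, uniform_nodes(n)), sup_error(f, equidistributed_nodes(n)))
```

The uniform error behaves like $\frac{1}{2} N^{-\alpha}$ (it is dominated by the first interval), while the equidistributed partition gives $\frac{1}{2} N^{-1}$.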

The above construction of an adaptive partition is based on equidistributing the $L^1$ norm of $f'$ or the total
variation of $f$ on each interval $I_k$. An alternative is to build $\mathcal{T}_N$ in such a way that all local errors are equal, i.e.

$$ e_{1,I_k}(f)_\infty = \eta, \tag{2.18} $$

for some $\eta = \eta(N) > 0$ independent of $k$. This new construction of $\mathcal{T}_N$ does not require that $f$ belongs to
$BV([0,1])$. In the particular case where $f \in BV([0,1])$, we obtain that

$$ N \eta \le \sum_{k=1}^{N} e_{1,I_k}(f)_\infty \le \frac{1}{2} \sum_{k=1}^{N} |f|_{BV(I_k)} \le \frac{1}{2} |f|_{BV}, $$

from which it immediately follows that

$$ e_{1,\mathcal{T}_N}(f)_\infty = \eta \le C N^{-1}, $$

with $C = \frac{1}{2} |f|_{BV}$. We thus have obtained the same error estimate as with the previous construction of $\mathcal{T}_N$.

The basic principle of error equidistribution, which is expressed by (2.18) in the case of piecewise constant
approximation in the uniform norm, plays a central role in the derivation of adaptive partitions for piecewise
polynomial approximation.

Similar to the case of uniform partitions we can express the optimality of (2.15) by a lower estimate when $f$
is smooth enough. For this purpose, we make a slight restriction on the set $\mathcal{A}_N$ of admissible partitions, assuming
that the diameter of all intervals decreases as $N \to +\infty$, according to

$$ \max_{I_k \in \mathcal{T}_N} |I_k| \le A N^{-1}, $$

for some $A > 0$ which may be arbitrarily large. Assume that $f \in C^1([0,1])$, so that for all $\varepsilon > 0$, there exists $\eta > 0$
such that

$$ |x - y| \le \eta \Rightarrow |f'(x) - f'(y)| \le \frac{\varepsilon}{A}. \tag{2.19} $$

If $N$ is such that $A N^{-1} \le \eta$, we can introduce on each interval $I_k$ an affine function $p_k(x) = f(x_k) + (x - x_k) f'(x_k)$
where $x_k$ is an arbitrary point in $I_k$, and we then have

$$ \|f - p_k\|_{L^\infty(I_k)} \le N^{-1} \varepsilon. $$

It follows that

$$ e_{1,I_k}(f)_\infty \ge e_{1,I_k}(p_k)_\infty - e_{1,I_k}(f - p_k)_\infty \ge \frac{1}{2} \Big( \int_{I_k} |p_k'(t)|\,dt - N^{-1} \varepsilon \Big) \ge \frac{1}{2} \Big( \int_{I_k} |f'(t)|\,dt - 2 N^{-1} \varepsilon \Big). $$

Since there exists at least one interval $I_k$ such that $\int_{I_k} |f'(t)|\,dt \ge N^{-1} \|f'\|_{L^1}$, it follows that

$$ e_{1,\mathcal{T}_N}(f)_\infty \ge \frac{1}{2} N^{-1} \big( \|f'\|_{L^1} - 2 \varepsilon \big). $$

This inequality becomes an equality only when all quantities $\int_{I_k} |f'(t)|\,dt$ are equal, which justifies the
equidistribution principle for the design of an optimal partition. Since $\varepsilon > 0$ is arbitrary, we have thus obtained the lower
estimate

$$ \liminf_{N \to +\infty} N \sigma_N(f)_\infty \ge \frac{1}{2} \|f'\|_{L^1}. \tag{2.20} $$

The restriction on the family of adaptive partitions $\mathcal{A}_N$ is not so severe since $A$ may be chosen arbitrarily large. In
particular, it is easy to prove that the upper estimate is almost preserved in the following sense: for a given $f \in C^1$
and any $\varepsilon > 0$, there exists $A > 0$ depending on $\varepsilon$ such that

$$ \limsup_{N \to +\infty} N \sigma_N(f)_\infty \le \frac{1}{2} \|f'\|_{L^1} + \varepsilon. $$

These results show that for smooth enough functions, the numerical quantity that governs the rate of convergence
$N^{-1}$ of adaptive piecewise constant approximations is exactly $\frac{1}{2} \|f'\|_{L^1}$. Note that $\|f'\|_{L^\infty}$ may be substantially
larger than $\|f'\|_{L^1}$ even for very smooth functions, in which case adaptive partitions perform at a similar rate as
uniform partitions, but with a much more favorable multiplicative constant.

2.3 A greedy refinement algorithm 

The principle of error equidistribution suggests a simple algorithm for the generation of adaptive partitions, based on
greedy refinement:

1. Initialization: $\mathcal{T}_1 = \{[0,1]\}$.

2. Given $\mathcal{T}_N$, select $I_k \in \mathcal{T}_N$ that maximizes the local error $e_{1,I_k}(f)_\infty$.

3. Split $I_k$ into two sub-intervals of equal size to obtain $\mathcal{T}_{N+1}$ and return to step 2.
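The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the local error is approximated by sampling, a priority queue from the standard `heapq` module selects the interval with the largest error, and the names `local_error` and `greedy_partition` are hypothetical.

```python
import heapq
import math

def local_error(f, a, b, samples=65):
    """Approximate e_{1,I}(f)_inf on I = [a, b] as half the sampled oscillation."""
    values = [f(a + (b - a) * i / (samples - 1)) for i in range(samples)]
    return 0.5 * (max(values) - min(values))

def greedy_partition(f, n_intervals):
    """Greedy dyadic refinement: always bisect the interval with the largest local error."""
    # heapq is a min-heap, so errors are stored with a minus sign.
    heap = [(-local_error(f, 0.0, 1.0), 0.0, 1.0)]       # step 1: start from [0, 1]
    while len(heap) < n_intervals:
        _, a, b = heapq.heappop(heap)                    # step 2: maximal local error
        mid = 0.5 * (a + b)                              # step 3: two equal halves
        heapq.heappush(heap, (-local_error(f, a, mid), a, mid))
        heapq.heappush(heap, (-local_error(f, mid, b), mid, b))
    intervals = sorted((a, b) for _, a, b in heap)
    return intervals, max(-e for e, _, _ in heap)

intervals, err = greedy_partition(math.sqrt, 32)
print(len(intervals), err)
```

With the heap, each refinement step costs $O(\log N)$ operations in addition to the evaluation of the two new local errors, so that generating $\mathcal{T}_N$ requires $O(N \log N)$ operations overall.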

The family $\mathcal{A}_N$ of adaptive partitions that are generated by this algorithm is characterized by the restriction that
all intervals are of the dyadic type $2^{-j}[n, n+1]$ for some $j \ge 0$ and $n \in \{0, \dots, 2^j - 1\}$. We also note that all such
partitions $\mathcal{T}_N$ may be identified to a finite subtree with $N$ leaves, picked within an infinite dyadic master tree $\mathcal{M}$ in
which each node represents a dyadic interval. The root of $\mathcal{M}$ corresponds to $[0,1]$ and each node $I$ of generation
$j$ corresponds to an interval of length $2^{-j}$ which has two children nodes of generation $j+1$ corresponding to the
two halves of $I$. This identification, which is illustrated on Figure 1, is useful for coding purposes since any such
subtree can be encoded by $2N$ bits.

Figure 1: Adaptive dyadic partitions identify to dyadic trees

We now want to understand how the approximations generated by the adaptive refinement algorithm behave in
comparison to those associated with the optimal partition. In particular, do we also have that $e_{1,\mathcal{T}_N}(f)_\infty \le C N^{-1}$
when $f' \in L^1$? The answer to this question turns out to be negative, but it was proved in [30] that a slight
strengthening of the smoothness assumption is sufficient to ensure this convergence rate: we instead assume that
the maximal function of $f'$ is in $L^1$. We recall that the maximal function of a locally integrable function $g$ is
defined by

$$ M_g(x) := \sup_{r > 0}\ |B(x,r)|^{-1} \int_{B(x,r)} |g(t)|\,dt. $$

It is known that $M_g \in L^p$ if and only if $g \in L^p$ for $1 < p < \infty$ and that $M_g \in L^1$ if and only if $g \in L \log L$, i.e.
$\int |g(t)| \log(1 + |g(t)|)\,dt < \infty$, see [42]. In this sense, the assumption that $M_{f'}$ is integrable is only slightly
stronger than $f \in W^{1,1}$.



If $\mathcal{T}_N := (I_1, \dots, I_N)$, define the accuracy

$$ \eta := \max_{1 \le k \le N} e_{1,I_k}(f)_\infty. $$

For each $k$, we denote by $J_k$ the interval which is the parent of $I_k$ in the refinement process. From the definition
of the algorithm, we necessarily have

$$ \eta \le e_{1,J_k}(f)_\infty \le \int_{J_k} |f'(t)|\,dt. $$

For all $x \in I_k$, the ball $B(x, 2|I_k|)$ contains $J_k$ and it follows therefore that

$$ M_{f'}(x) \ge |B(x, 2|I_k|)|^{-1} \int_{B(x, 2|I_k|)} |f'(t)|\,dt \ge [4|I_k|]^{-1} \eta, $$

which implies in turn

$$ \int_{I_k} M_{f'}(t)\,dt \ge \eta / 4. $$

If $M_{f'}$ is integrable, this yields the estimate

$$ N \eta \le 4 \int_0^1 M_{f'}(t)\,dt. $$

It follows that

$$ e_{1,\mathcal{T}_N}(f)_\infty = \eta \le C N^{-1} $$

with $C = 4 \|M_{f'}\|_{L^1}$. We have thus established the following result.

Theorem 2.3 If $f$ is a continuous function defined on $[0,1]$ and if $\sigma_N(f)_\infty$ denotes the $L^\infty$ error of piecewise
constant approximation on adaptive partitions of dyadic type, we have

$$ M_{f'} \in L^1([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-1}, \tag{2.21} $$

and this rate may be achieved by the above described greedy algorithm.

Note however that a converse to (2.21) does not hold and that we do not so far know of a simple smoothness
property that would be exactly equivalent to the rate of approximation $N^{-1}$ by dyadic adaptive partitions. A
by-product of (2.21) is that

$$ f \in W^{1,p}([0,1]) \Rightarrow \sigma_N(f)_\infty \le C N^{-1}, \tag{2.22} $$

for any $p > 1$.

3 Adaptive and isotropic approximation



We now consider the problem of piecewise polynomial approximation on a domain $\Omega \subset \mathbb{R}^d$, using adaptive
and isotropic partitions. We therefore consider a sequence $(\mathcal{A}_N)_{N > 0}$ of families of partitions that satisfies the
restriction (1.2). We use piecewise polynomials of degree $m-1$ for some fixed but arbitrary $m$.

Here and in all the rest of the chapter, we restrict our attention to partitions into geometrically simple elements
which are either cubes, rectangles or simplices. These simple elements satisfy a property of affine invariance:
there exists a reference element $R$ such that any $T \in \mathcal{T} \in \mathcal{A}_N$ is the image of $R$ by an invertible affine transformation
$A_T$. We can choose $R$ to be the unit cube $[0,1]^d$ or the unit simplex $\{0 \le x_1 \le \dots \le x_d \le 1\}$ in the case of partitions
by cubes and rectangles or simplices, respectively.

3.1 Local estimates 

If $T \in \mathcal{T}$ is an element and $f$ is a function defined on $\Omega$, we study the local approximation error

$$ e_{m,T}(f)_p := \min_{\pi \in \mathbb{P}_{m-1}} \|f - \pi\|_{L^p(T)}. \tag{3.23} $$

When $p = 2$ the minimizing polynomial is given by

$$ \pi := P_{m,T} f, $$

where $P_{m,T}$ is the $L^2$-orthogonal projection, and can therefore be computed by solving a least squares system.
When $p \ne 2$, the minimizing polynomial is generally not easy to determine. However it is easily seen that the
$L^2$-orthogonal projection remains an acceptable choice: indeed, it can easily be checked that the operator norm
of $P_{m,T}$ in $L^p(T)$ is bounded by a constant $C$ that only depends on $(m,d)$ but not on the cube or simplex $T$. From
this we infer that for all $f$ and $T$ one has

$$ e_{m,T}(f)_p \le \|f - P_{m,T} f\|_{L^p(T)} \le (1 + C)\, e_{m,T}(f)_p. \tag{3.24} $$

Local estimates for $e_{m,T}(f)_p$ can be obtained from local estimates on the reference element $R$, remarking that

$$ e_{m,T}(f)_p = \Big( \frac{|T|}{|R|} \Big)^{1/p} e_{m,R}(g)_p, \tag{3.25} $$

where $g = f \circ A_T$. Assume that $p, \tau \ge 1$ are such that $\frac{1}{\tau} = \frac{1}{p} + \frac{m}{d}$, and let $g \in W^{m,\tau}(R)$. We know from Sobolev
embedding that

$$ \|g\|_{L^p(R)} \le C \|g\|_{W^{m,\tau}(R)}, $$

where the constant $C$ depends on $p$, $\tau$ and $R$. Accordingly, we obtain

$$ e_{m,R}(g)_p \le C \min_{\pi \in \mathbb{P}_{m-1}} \|g - \pi\|_{W^{m,\tau}(R)}. \tag{3.26} $$

We then invoke the Deny-Lions theorem which states that if $R$ is a connected domain, there exists a constant $C$ that
only depends on $m$ and $R$ such that

$$ \min_{\pi \in \mathbb{P}_{m-1}} \|g - \pi\|_{W^{m,\tau}(R)} \le C |g|_{W^{m,\tau}(R)}. \tag{3.27} $$

If $g = f \circ A_T$, we obtain by this change of variable that

$$ |g|_{W^{m,\tau}(R)} \le C \Big( \frac{|R|}{|T|} \Big)^{1/\tau} \|B_T\|^m |f|_{W^{m,\tau}(T)}, \tag{3.28} $$

where $B_T$ is the linear part of $A_T$ and $C$ is a constant that only depends on $m$ and $d$. A well known and easy to
derive bound for $\|B_T\|$ is

$$ \|B_T\| \le \frac{h_T}{\rho_R}. \tag{3.29} $$

Combining (3.25), (3.26), (3.27), (3.28) and (3.29), we thus obtain a local estimate of the form

$$ e_{m,T}(f)_p \le C |T|^{1/p - 1/\tau} h_T^m |f|_{W^{m,\tau}(T)} = C |T|^{-m/d} h_T^m |f|_{W^{m,\tau}(T)}, $$

where we have used the relation $\frac{1}{\tau} = \frac{1}{p} + \frac{m}{d}$. From the isotropy restriction (1.2), there exists a constant $C > 0$
independent of $T$ such that $h_T^d \le C |T|$. We have thus established the following local error estimate.

Theorem 3.1 If $f \in W^{m,\tau}(\Omega)$, we have for all elements $T$

$$ e_{m,T}(f)_p \le C |f|_{W^{m,\tau}(T)}, \tag{3.30} $$

where the constant $C$ only depends on $m$, $R$ and the constants in (1.2).

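Let us mention several useful generalizations of the local estimate (3.30) that can be obtained by a similar
approach based on a change of variable on the reference element. First, if $f \in W^{s,\tau}(\Omega)$ for some $0 \le s \le m$ and
$\tau \ge 1$ such that $\frac{1}{\tau} = \frac{1}{p} + \frac{s}{d}$, we have

$$ e_{m,T}(f)_p \le C |f|_{W^{s,\tau}(T)}. \tag{3.31} $$

Recall that when $s$ is not an integer, the $W^{s,\tau}$ semi-norm is defined by

$$ |f|_{W^{s,\tau}(\Omega)}^\tau := \sum_{|\alpha| = n} \int_{\Omega \times \Omega} \frac{|\partial^\alpha f(x) - \partial^\alpha f(y)|^\tau}{|x - y|^{(s-n)\tau + d}}\,dx\,dy, $$

where $n$ is the largest integer below $s$. In the more general case where $\frac{1}{\tau} \le \frac{1}{p} + \frac{s}{d}$, we obtain an estimate that
depends on the diameter of $T$:

$$ e_{m,T}(f)_p \le C h_T^r |f|_{W^{s,\tau}(T)}, \quad r := \frac{d}{p} - \frac{d}{\tau} + s \ge 0. \tag{3.32} $$

Finally, remark that for a fixed $p \ge 1$ and $s$, the index $\tau$ defined by $\frac{1}{\tau} = \frac{1}{p} + \frac{s}{d}$ may be smaller than 1, in which
case the Sobolev space $W^{s,\tau}(\Omega)$ is not well defined. The local estimate remains valid if $W^{s,\tau}(\Omega)$ is replaced by the
Besov space $B^s_{\tau,\tau}(\Omega)$. This space consists of all functions $f \in L^\tau(\Omega)$ such that

$$ |f|_{B^s_{\tau,\tau}} := \big\| t^{-s} \omega_k(f,t)_\tau \big\|_{L^\tau(]0,\infty[,\, \frac{dt}{t})} $$

is finite. Here $k$ is the smallest integer above $s$ and $\omega_k(f,t)_\tau$ denotes the $L^\tau$-modulus of smoothness of order $k$
defined by

$$ \omega_k(f,t)_\tau := \sup_{|h| \le t} \|\Delta_h^k f\|_{L^\tau}, $$

where $\Delta_h f := f(\cdot + h) - f(\cdot)$ is the usual difference operator. The space $B^s_{\tau,\tau}$ describes functions which have "$s$
derivatives in $L^\tau$" in a very similar way as $W^{s,\tau}$. In particular it is known that these two spaces coincide when
$\tau \ge 1$ and $s$ is not an integer. We refer to [29] and [18] for more details on Besov spaces and their characterization
by approximation procedures. For all $p, \tau > 0$ and $0 \le s \le m$ such that $\frac{1}{\tau} \le \frac{1}{p} + \frac{s}{d}$, a local estimate generalizing
(3.32) has the form

$$ e_{m,T}(f)_p \le C h_T^r |f|_{B^s_{\tau,\tau}(T)}, \quad r := \frac{d}{p} - \frac{d}{\tau} + s \ge 0. \tag{3.33} $$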

3.2 Global estimates 

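We now turn our local estimates into global estimates, recalling that

$$ e_{m,\mathcal{T}}(f)_p := \min_{g \in V_{\mathcal{T}}} \|f - g\|_{L^p} = \Big( \sum_{T \in \mathcal{T}} e_{m,T}(f)_p^p \Big)^{1/p}, $$

with the usual modification when $p = \infty$. We apply the principle of error equidistribution assuming that the
partition $\mathcal{T}_N$ is built in such way that

$$ e_{m,T}(f)_p = \eta, \tag{3.34} $$

for all $T \in \mathcal{T}_N$ where $N = N(\eta)$. A first immediate estimate for the global error is therefore

$$ e_{m,\mathcal{T}_N}(f)_p \le N^{1/p} \eta. \tag{3.35} $$

Assume now that $f \in W^{m,\tau}(\Omega)$ with $\tau \ge 1$ such that $\frac{1}{\tau} = \frac{1}{p} + \frac{m}{d}$. It then follows from Theorem 3.1 that

$$ N \eta^\tau \le \sum_{T \in \mathcal{T}_N} e_{m,T}(f)_p^\tau \le C \sum_{T \in \mathcal{T}_N} |f|_{W^{m,\tau}(T)}^\tau = C |f|_{W^{m,\tau}}^\tau. $$

Combining with (3.35) and using the relation $\frac{1}{\tau} = \frac{1}{p} + \frac{m}{d}$, we have thus obtained that for adaptive partitions $\mathcal{T}_N$
built according to the error equidistribution, we have

$$ e_{m,\mathcal{T}_N}(f)_p \le C N^{-m/d} |f|_{W^{m,\tau}}. \tag{3.36} $$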
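For clarity, the exponent arithmetic behind (3.36) can be spelled out as follows. This is only a restatement of the two steps used in the text, with $C$ a generic constant that may change from one inequality to the next.

```latex
\begin{aligned}
N\eta^{\tau} \le C\,|f|_{W^{m,\tau}}^{\tau}
  &\;\Longrightarrow\; \eta \le C\,N^{-1/\tau}\,|f|_{W^{m,\tau}},\\
e_{m,\mathcal{T}_N}(f)_p \le N^{1/p}\,\eta
  &\;\le\; C\,N^{\frac1p-\frac1\tau}\,|f|_{W^{m,\tau}}
  \;=\; C\,N^{-m/d}\,|f|_{W^{m,\tau}},
  \qquad\text{since } \tfrac1\tau-\tfrac1p=\tfrac md .
\end{aligned}
```

As a numerical illustration of the relation between the indices, taking $d = 2$, $m = 2$ and $p = 2$ gives $\frac{1}{\tau} = \frac{1}{2} + 1 = \frac{3}{2}$, i.e. $\tau = \frac{2}{3} < 1$, which is a case where the Sobolev space has to be replaced by a Besov space.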






By using (3.31), we obtain in a similar manner that if $0 \le s \le m$ and $\tau \ge 1$ are such that $\frac{1}{\tau} = \frac{1}{p} + \frac{s}{d}$, then

$$ e_{m,\mathcal{T}_N}(f)_p \le C N^{-s/d} |f|_{W^{s,\tau}}. \tag{3.37} $$

Similar results hold when $\tau < 1$ with $W^{s,\tau}$ replaced by $B^s_{\tau,\tau}$ but their proof requires a bit more work due to the
fact that $|f|^\tau_{B^s_{\tau,\tau}}$ is not sub-additive with respect to the union of sets. We also reach similar estimates in the case
$p = \infty$ by a standard modification of the argument.

The estimate (3.36) suggests that for piecewise polynomial approximation on adaptive and isotropic partitions,
we have

$$ \sigma_N(f)_p \le C N^{-m/d} |f|_{W^{m,\tau}}, \quad \frac{1}{\tau} = \frac{1}{p} + \frac{m}{d}. \tag{3.38} $$

Such an estimate should be compared to (1.4), in a similar way as we compared (2.17) with (2.8) in the one
dimensional case: the same rate $N^{-m/d}$ is governed by a weaker smoothness condition.

In contrast to the one dimensional case, however, we cannot easily prove the validity of (3.38) since it is not
obvious that there exists a partition $\mathcal{T}_N \in \mathcal{A}_N$ which equidistributes the error in the sense of (3.34). It should be
remarked that the derivation of estimates such as (3.36) does not require a strict equidistribution of the error. It is
for instance sufficient to assume that $e_{m,T}(f)_p \le \eta$ for all $T \in \mathcal{T}_N$, and that

$$ c_1 \eta \le e_{m,T}(f)_p, $$

for at least $c_2 N$ elements of $\mathcal{T}_N$, where $c_1$ and $c_2$ are fixed constants. Nevertheless, the construction of a partition
$\mathcal{T}_N$ satisfying such prescriptions still appears as a difficult task both from a theoretical and algorithmic point of
view.



3.3 An isotropic greedy refinement algorithm 

We now discuss a simple adaptive refinement algorithm which emulates error equidistribution, similar to the algorithm which was discussed in the one dimensional case. For this purpose, we first build a hierarchy of nested quasi-uniform partitions $(\mathcal{D}_j)_{j\ge 0}$, where $\mathcal{D}_0$ is a coarse triangulation and where $\mathcal{D}_{j+1}$ is obtained from $\mathcal{D}_j$ by splitting each of its elements into a fixed number $K$ of children. We therefore have

$$\#(\mathcal{D}_j) = K^j \#(\mathcal{D}_0),$$

and since the partitions $\mathcal{D}_j$ are assumed to be quasi-uniform, there exist two constants $0 < c_1 \le c_2$ such that

$$c_1 K^{-j/d} \le h_T \le c_2 K^{-j/d}, \tag{3.39}$$

for all $T \in \mathcal{D}_j$ and $j \ge 0$. For example, in the case of two dimensional triangulations, we may choose $K = 4$ by splitting each triangle into 4 similar triangles by the midpoint rule, or $K = 2$ by bisecting each triangle from one vertex to the midpoint of the opposite edge according to a prescribed rule in order to preserve isotropy. Specific rules which have been extensively studied are bisection from the most recently generated vertex [8] or towards the longest edge [41]. In the case of partitions by rectangles, we may preserve isotropy by splitting each rectangle into 4 similar rectangles by the midpoint rule.
The refinement algorithm reads as follows:

1. Initialization: $\mathcal{T}_{N_0} = \mathcal{D}_0$ with $N_0 := \#(\mathcal{D}_0)$.

2. Given $\mathcal{T}_N$, select $T \in \mathcal{T}_N$ that maximizes $e_{m,T}(f)_p$.

3. Split $T$ into its $K$ children to obtain $\mathcal{T}_{N+K-1}$ and return to step 2.
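The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the text: piecewise constant approximation ($m = 1$) in the $L^2$ norm on the unit square, a hierarchy whose root $\mathcal{D}_0$ is the single square $[0,1]^2$ with quad-split refinement ($K = 4$), and local errors estimated by an $n \times n$ midpoint rule. The function names `local_error`, `greedy_refine` and `global_error` are hypothetical.

```python
import heapq
import math

def local_error(f, x0, y0, size, n=8):
    """Approximate L2 error of the best constant on the square
    [x0, x0+size] x [y0, y0+size], using an n-by-n midpoint rule."""
    h = size / n
    vals = [f(x0 + (i + 0.5) * h, y0 + (j + 0.5) * h)
            for i in range(n) for j in range(n)]
    mean = sum(vals) / len(vals)
    return math.sqrt(sum((v - mean) ** 2 for v in vals) * h * h)

def greedy_refine(f, n_max):
    """Greedy quad-split refinement of the unit square (K = 4, N_0 = 1).
    Returns the leaves as tuples (x0, y0, size, local error)."""
    # heapq is a min-heap, so errors are stored with a minus sign
    heap = [(-local_error(f, 0.0, 0.0, 1.0), 0.0, 0.0, 1.0)]
    while len(heap) + 3 <= n_max:                # each split adds K - 1 = 3 leaves
        _, x0, y0, size = heapq.heappop(heap)    # step 2: element of largest error
        half = size / 2
        for dx in (0.0, half):                   # step 3: split into 4 children
            for dy in (0.0, half):
                err = local_error(f, x0 + dx, y0 + dy, half)
                heapq.heappush(heap, (-err, x0 + dx, y0 + dy, half))
    return [(x0, y0, size, -neg) for neg, x0, y0, size in heap]

def global_error(leaves):
    """Global L2 error: square root of the sum of squared local errors."""
    return math.sqrt(sum(err ** 2 for _, _, _, err in leaves))
```

For a function with a point singularity such as `f = lambda x, y: (x * x + y * y) ** 0.25`, calling `greedy_refine(f, 100)` returns 100 squares of unequal sizes. The priority queue is a design choice of this sketch: it makes the selection in step 2 logarithmic in $N$ instead of linear.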

Similar to the one dimensional case, the adaptive partitions that are generated by this algorithm are restricted to a particular family where each element $T$ is picked within an infinite dyadic master tree $\mathcal{D} = \cup_{j\ge 0}\mathcal{D}_j$ whose roots are given by the elements of $\mathcal{D}_0$. The partition $\mathcal{T}_N$ may be identified to a finite subtree of $\mathcal{D}$ with $N$ leaves. Figure 2 displays an example of adaptively refined partitions either based on longest edge bisection for triangles, or by quad-split for squares.

This algorithm cannot exactly achieve error equidistribution, but our next result reveals that it generates partitions that yield error estimates almost similar to (3.36).

Theorem 3.2 If $f \in W^{m,\tau}(\Omega)$ for some $\tau \ge 1$ such that $\frac{1}{\tau} < \frac{1}{p} + \frac{m}{d}$, we then have for all $N \ge 2N_0 = 2\#(\mathcal{D}_0)$,

$$e_{m,\mathcal{T}_N}(f)_p \le C N^{-m/d} |f|_{W^{m,\tau}}, \tag{3.40}$$

where $C$ depends on $\tau$, $m$, $K$, $R$ and the choice of $\mathcal{D}_0$. We therefore have for piecewise polynomial approximation on adaptively refined partitions

$$\sigma_N(f)_p \le C N^{-m/d} |f|_{W^{m,\tau}}, \qquad \frac{1}{\tau} < \frac{1}{p} + \frac{m}{d}. \tag{3.41}$$

Figure 2: Adaptively refined partitions based on longest edge bisection (left) or quad-split (right)


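Proof: The technique used for proving this result is adapted from the proof of a similar result for tree-structured wavelet approximation in [19]. We define

$$\eta := \max_{T\in\mathcal{T}_N} e_{m,T}(f)_p, \tag{3.42}$$

so that we obviously have when $p < \infty$,

$$e_{m,\mathcal{T}_N}(f)_p \le N^{1/p}\eta. \tag{3.43}$$

For $T \in \mathcal{T}_N \setminus \mathcal{D}_0$, we denote by $P(T)$ its parent in the refinement process. From the definition of the algorithm, we necessarily have

$$\eta \le e_{m,P(T)}(f)_p,$$

and therefore, using (3.32) with $s = m$, we obtain

$$\eta \le C h_{P(T)}^{r} |f|_{W^{m,\tau}(P(T))}, \tag{3.44}$$

with $r := \frac{d}{p} - \frac{d}{\tau} + m > 0$. We next denote by $\mathcal{T}_{N,j} := \mathcal{T}_N \cap \mathcal{D}_j$ the elements of generation $j$ in $\mathcal{T}_N$ and define $N_j := \#(\mathcal{T}_{N,j})$. We estimate $N_j$ by taking the $\tau$ power of (3.44) and summing over $\mathcal{T}_{N,j}$, which gives

$$N_j \eta^\tau \le C^\tau \sum_{T\in\mathcal{T}_{N,j}} h_{P(T)}^{r\tau} |f|^\tau_{W^{m,\tau}(P(T))} \le K C^\tau \Big(\sup_{T\in\mathcal{D}_{j-1}} h_T^{r\tau}\Big) |f|^\tau_{W^{m,\tau}}.$$

Using (3.39) and the fact that $\#(\mathcal{D}_j) = N_0 K^j$, we thus obtain

$$N_j \le \min\{C \eta^{-\tau} K^{-jr\tau/d} |f|^\tau_{W^{m,\tau}},\; N_0 K^j\}.$$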

We now evaluate

$$N - N_0 \le \sum_{j\ge 1} N_j \le \sum_{j\ge 1} \min\{C \eta^{-\tau} K^{-jr\tau/d} |f|^\tau_{W^{m,\tau}},\; N_0 K^j\}.$$

By introducing $j_0$ the smallest integer such that $C \eta^{-\tau} K^{-j_0 r\tau/d} |f|^\tau_{W^{m,\tau}} \le N_0 K^{j_0}$, we find that

$$N - N_0 \le N_0 \sum_{j<j_0} K^j + C \eta^{-\tau} |f|^\tau_{W^{m,\tau}} \sum_{j\ge j_0} K^{-jr\tau/d},$$

which after evaluation of $j_0$ yields

$$N - N_0 \le C \eta^{-\frac{dp}{d+mp}} |f|_{W^{m,\tau}}^{\frac{dp}{d+mp}},$$

and therefore, assuming that $N \ge 2N_0$,

$$\eta \le C N^{-1/p-m/d} |f|_{W^{m,\tau}}.$$

Combining this estimate with (3.43) gives the announced result. In the case $p = \infty$, a standard modification of the argument leads to a similar conclusion. □

Figure 3: Conforming refinement (left) and graded refinement (right)



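Remark 3.3 By similar arguments, we obtain that if $f \in W^{s,\tau}(\Omega)$ for some $\tau \ge 1$ and $0 \le s \le m$ such that $\frac{1}{\tau} < \frac{1}{p} + \frac{s}{d}$, we have

$$e_{m,\mathcal{T}_N}(f)_p \le C N^{-s/d} |f|_{W^{s,\tau}}.$$

The restriction $\tau \ge 1$ may be dropped if we replace $W^{s,\tau}$ by the Besov space $B^s_{\tau,\tau}$, at the price of a more technical proof.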

Remark 3.4 The same approximation results can be obtained if we replace $e_{m,T}(f)_p$ in the refinement algorithm by the more computable quantity $\|f - P_{m,T}f\|_{L^p(T)}$, due to the equivalence (3.24).



Remark 3.5 The greedy refinement algorithm defines a particular sequence of subtrees $\mathcal{T}_N$ of the master tree $\mathcal{D}$, but $\mathcal{T}_N$ is not ensured to be the best choice in the sense of minimizing the approximation error among all subtrees of cardinality at most $N$. The selection of an optimal tree can be performed by an additional pruning strategy after enough refinement has been performed. This approach was developed in the context of statistical estimation under the acronym CART (classification and regression tree), see [12, 32]. Another approach that builds a near optimal subtree only based on refinement was proposed in [7].

Remark 3.6 The partitions which are built by the greedy refinement algorithm are non-conforming. Additional refinement steps are needed when the user insists on conformity, for instance when solving PDE's. For specific refinement procedures, it is possible to bound the total number of elements that are due to additional conforming refinement by the total number of triangles $T$ which have been refined due to the fact that $e_{m,T}(f)_p$ was the largest at some stage of the algorithm, up to a fixed multiplicative constant. In turn, the convergence rate is left unchanged compared to the original non-conforming algorithm. This fact was proved in [8] for adaptive triangulations built by the rule of newest vertex bisection. A closely related concept is the amount of additional elements which are needed in order to impose that the partition satisfies a grading property, in the sense that two adjacent elements may only differ by one refinement level. For specific partitions, it was proved in [23] that this amount is bounded, up to a fixed multiplicative constant, by the number of elements contained in the non-graded partitions. Figure 3 displays the conforming and graded partitions obtained by the minimal amount of additional refinement from the partitions of Figure 2.

The refinement algorithm may also be applied to discretized data, such as numerical images. The approximated $512 \times 512$ image is displayed on Figure 4 together with its approximation obtained by the refinement algorithm based on newest vertex bisection and the error measured in $L^2$, using $N = 2000$ triangles. In this case, $f$ has the form of a discrete array of pixels, and the $L^2(T)$-orthogonal projection is replaced by the $\ell^2(S_T)$-orthogonal projection, where $S_T$ is the set of pixels with centers contained in $T$. The use of adaptive isotropic partitions has strong similarity with wavelet thresholding [28, 18]. In particular, it results in ringing artifacts near the edges.



3.4 The case of smooth functions. 

Although the estimate (3.38) might not be achievable for a general $f \in W^{m,\tau}(\Omega)$, we can show that for smooth enough $f$, the numerical quantity that governs the rate of convergence $N^{-m/d}$ is exactly

$$|f|_{W^{m,\tau}} := \Big(\sum_{|\alpha|=m} \|\partial^\alpha f\|^\tau_{L^\tau}\Big)^{1/\tau},$$

that we may define as so even for $\tau < 1$. For this purpose, we assume that $f \in C^m(\Omega)$. Our analysis is based on the fact that such a function can be locally approximated by a polynomial of degree $m$.

Figure 4: The image "peppers" (left) and its approximation by 2000 isotropic triangles obtained by the greedy algorithm (right).

We first study in more detail the approximation error on a function $q \in \mathbb{P}_m$. We denote by $\mathbb{H}_m$ the space of homogeneous polynomials of degree $m$. To $q \in \mathbb{P}_m$, we associate its homogeneous part $\mathbf{q} \in \mathbb{H}_m$, which is such that

$$q - \mathbf{q} \in \mathbb{P}_{m-1}.$$

We denote by $q_\alpha$ the coefficient of $\mathbf{q}$ associated to the multi-index $\alpha = (\alpha_1, \cdots, \alpha_d)$ with $|\alpha| = m$. We thus have

$$e_{m,T}(q)_p = e_{m,T}(\mathbf{q})_p.$$

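Using the affine transformation $A_T$ which maps the reference element $R$ onto $T$, and denoting by $B_T$ its linear part, we can write

$$e_{m,T}(\mathbf{q})_p = \Big(\frac{|T|}{|R|}\Big)^{1/p} e_{m,R}(\mathbf{q}\circ A_T)_p = \Big(\frac{|T|}{|R|}\Big)^{1/p} e_{m,R}(\tilde{\mathbf{q}})_p, \qquad \tilde{\mathbf{q}} := \mathbf{q}\circ B_T \in \mathbb{H}_m,$$

where we have used the fact that $\tilde{\mathbf{q}} - \mathbf{q}\circ A_T \in \mathbb{P}_{m-1}$. Introducing for any $r > 0$ the quasi-norm on $\mathbb{H}_m$

$$|\mathbf{q}|_r := \Big(\sum_{|\alpha|=m} |q_\alpha|^r\Big)^{1/r},$$

one easily checks that

$$C^{-1}\|B_T^{-1}\|^{-m} |\mathbf{q}|_r \le |\tilde{\mathbf{q}}|_r \le C \|B_T\|^m |\mathbf{q}|_r,$$

for some constant $C > 0$ that only depends on $m$, $r$ and $R$. We then remark that $e_{m,R}(\mathbf{q})_p$ is a norm on $\mathbb{H}_m$, which is equivalent to $|\mathbf{q}|_r$ since $\mathbb{H}_m$ is finite dimensional. It follows that there exist constants $0 < C_1 \le C_2$ such that for all $\mathbf{q}$ and $T$

$$C_1 |T|^{1/p} \|B_T^{-1}\|^{-m} |\mathbf{q}|_r \le e_{m,T}(\mathbf{q})_p \le C_2 |T|^{1/p} \|B_T\|^m |\mathbf{q}|_r.$$

Finally, using the bound (3.29) for $\|B_T\|$ and its symmetrical counterpart

$$\|B_T^{-1}\| \le \frac{h_R}{\rho_T},$$

together with the isotropy restriction (1.2), we obtain with $\frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}$ the equivalence

$$C_1 |T|^{1/\tau} |\mathbf{q}|_r \le e_{m,T}(\mathbf{q})_p \le C_2 |T|^{1/\tau} |\mathbf{q}|_r,$$

where $C_1$ and $C_2$ only depend on $m$, $R$ and the constant $C$ in (1.2). Choosing $r = \tau$, this equivalence can be rewritten as

$$C_1 \Big(\sum_{|\alpha|=m} \|q_\alpha\|^\tau_{L^\tau(T)}\Big)^{1/\tau} \le e_{m,T}(\mathbf{q})_p \le C_2 \Big(\sum_{|\alpha|=m} \|q_\alpha\|^\tau_{L^\tau(T)}\Big)^{1/\tau}.$$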

Using shorter notations, this is summarized by the following result.

Lemma 3.7 Let $p \ge 1$ and $\frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}$. There exist constants $C_1$ and $C_2$ that only depend on $m$, $R$ and the constant $C$ in (1.2) such that

$$C_1 |q|_{W^{m,\tau}(T)} \le e_{m,T}(q)_p \le C_2 |q|_{W^{m,\tau}(T)}, \tag{3.45}$$

for all $q \in \mathbb{P}_m$.

In what follows, we shall frequently identify the $m$-th order derivatives of a function $f$ at some point $x$ with a homogeneous polynomial of degree $m$. In particular we write

$$|d^m f(x)|_r := \Big(\sum_{|\alpha|=m} |\partial^\alpha f(x)|^r\Big)^{1/r}.$$

We first establish a lower estimate on $\sigma_N(f)_p$, which reflects the saturation rate $N^{-m/d}$ of the method, under a slight restriction on the set $\mathcal{A}_N$ of admissible partitions, assuming that the diameter of all elements decreases as $N \to +\infty$, according to

$$\max_{T\in\mathcal{T}_N} h_T \le A N^{-1/d}, \tag{3.46}$$

for some $A > 0$ which may be arbitrarily large.

Theorem 3.8 Under the restriction (3.46), there exists a constant $c > 0$ that only depends on $m$, $R$ and the constant $C$ in (1.2) such that

$$\liminf_{N\to+\infty} N^{m/d} \sigma_N(f)_p \ge c |f|_{W^{m,\tau}}, \tag{3.47}$$

for all $f \in C^m(\Omega)$, where $\frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}$.

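Proof: If $f \in C^m(\Omega)$ and $x \in \Omega$, we denote by $q_x$ the Taylor polynomial of order $m$ at the point $x = (x_1, \cdots, x_d)$:

$$q_x(y) := \sum_{|\alpha|\le m} \frac{1}{\alpha!}\,\partial^\alpha f(x)\,(y-x)^\alpha. \tag{3.48}$$

If $\mathcal{T}_N$ is a partition in $\mathcal{A}_N$, we may write for each element $T \in \mathcal{T}_N$ and $x \in T$

$$\begin{aligned}
e_{m,T}(f)_p &\ge e_{m,T}(q_x)_p - \|f-q_x\|_{L^p(T)} \\
&\ge C_1 |q_x|_{W^{m,\tau}(T)} - \|f-q_x\|_{L^p(T)} \\
&\ge c |f|_{W^{m,\tau}(T)} - C_1 |f-q_x|_{W^{m,\tau}(T)} - \|f-q_x\|_{L^p(T)},
\end{aligned}$$

with $c := C_1 \min\{1, \tau\}$, where we have used the lower bound in (3.45) and the quasi-triangle inequality

$$\|u+v\|_{L^\tau} \le \max\{1, \tau^{-1}\}\,(\|u\|_{L^\tau} + \|v\|_{L^\tau}).$$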

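By the continuity of the $m$-th order derivative of $f$, we are ensured that for all $\varepsilon > 0$ there exists $\delta > 0$ such that

$$|x-y| \le \delta \;\Rightarrow\; |f(y) - q_x(y)| \le \varepsilon |x-y|^m \quad\text{and}\quad |d^m f(y) - d^m q_x|_\tau \le \varepsilon. \tag{3.49}$$

Therefore if $N \ge N_0$ such that $A N_0^{-1/d} \le \delta$, we have

$$\begin{aligned}
e_{m,T}(f)_p &\ge c |f|_{W^{m,\tau}(T)} - \big(C_1 \varepsilon |T|^{1/\tau} + \varepsilon h_T^m |T|^{1/p}\big) \\
&\ge c |f|_{W^{m,\tau}(T)} - (1 + C_1)\,\varepsilon\, h_T^{d/\tau} \\
&\ge c |f|_{W^{m,\tau}(T)} - C \varepsilon N^{-1/\tau},
\end{aligned}$$

where the constant $C$ depends on $C_1$ in (3.45) and $A$ in (3.46). Using the triangle inequality, it follows that

$$e_{m,\mathcal{T}_N}(f)_p = \Big(\sum_{T\in\mathcal{T}_N} e_{m,T}(f)_p^p\Big)^{1/p} \ge c \Big(\sum_{T\in\mathcal{T}_N} |f|^p_{W^{m,\tau}(T)}\Big)^{1/p} - C \varepsilon N^{-m/d}.$$

Using Hölder's inequality, we find that

$$|f|_{W^{m,\tau}} = \Big(\sum_{T\in\mathcal{T}_N} |f|^\tau_{W^{m,\tau}(T)}\Big)^{1/\tau} \le N^{m/d} \Big(\sum_{T\in\mathcal{T}_N} |f|^p_{W^{m,\tau}(T)}\Big)^{1/p}, \tag{3.50}$$

which combined with the previous estimates shows that

$$N^{m/d} e_{m,\mathcal{T}_N}(f)_p \ge c |f|_{W^{m,\tau}} - C\varepsilon.$$

Since $\varepsilon > 0$ is arbitrary, this concludes the proof. □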



Remark 3.9 The Hölder inequality (3.50) becomes an equality if and only if all quantities in the sum are equal, which justifies the error equidistribution principle since these quantities are approximations of $e_{m,T}(f)_p$.

We next show that if $f \in C^m(\Omega)$, the adaptive approximations obtained by the greedy refinement algorithm introduced in §3.3 satisfy an upper estimate which closely matches the lower estimate (3.47).

Theorem 3.10 There exists a constant $C$ that only depends on $m$, $R$ and on the choice of the hierarchy $(\mathcal{D}_j)_{j\ge 0}$ such that for all $f \in C^m(\Omega)$, the partitions $\mathcal{T}_N$ obtained by the greedy algorithm satisfy

$$\limsup_{N\to+\infty} N^{m/d} e_{m,\mathcal{T}_N}(f)_p \le C |f|_{W^{m,\tau}}, \tag{3.51}$$

where $\frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}$. In turn, for adaptively refined partitions, we have

$$\limsup_{N\to+\infty} N^{m/d} \sigma_N(f)_p \le C |f|_{W^{m,\tau}}, \tag{3.52}$$

for all $f \in C^m(\Omega)$.



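Proof: For any $\varepsilon > 0$, we choose $\delta > 0$ such that (3.49) holds. We first remark that there exists $N(\delta)$ sufficiently large such that for any $N \ge N(\delta)$ at least $N/2$ elements $T \in \mathcal{T}_N$ have parents with diameter $h_{P(T)} \le \delta$. Indeed, the uniform isotropy of the elements ensures that

$$|T| \ge c\, h^d_{P(T)},$$

for some fixed constant $c > 0$. Since the elements of $\mathcal{T}_N$ are disjoint, we thus have

$$\#\{T \in \mathcal{T}_N \; ; \; h_{P(T)} > \delta\} \le \frac{|\Omega|}{c\,\delta^d},$$

and the right-hand side is less than $N/2$ for large enough $N$. We denote by $\tilde{\mathcal{T}}_N$ the subset of $T \in \mathcal{T}_N$ such that $h_{P(T)} \le \delta$. Defining $\eta$ as previously by (3.42), we observe that for all $T \in \tilde{\mathcal{T}}_N \setminus \mathcal{D}_0$, we have

$$\eta \le e_{m,P(T)}(f)_p. \tag{3.53}$$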
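If $x$ is any point contained in $T$ and $q_x$ the Taylor polynomial of $f$ at this point defined by (3.48), we have

$$\begin{aligned}
e_{m,P(T)}(f)_p &\le e_{m,P(T)}(q_x)_p + \|f - q_x\|_{L^p(P(T))} \\
&\le C_2 |q_x|_{W^{m,\tau}(P(T))} + \varepsilon h^m_{P(T)} |P(T)|^{1/p} \\
&= C_2 \Big(\frac{|P(T)|}{|T|}\Big)^{1/\tau} |q_x|_{W^{m,\tau}(T)} + \varepsilon h^m_{P(T)} |P(T)|^{1/p} \\
&\le D_2 \Big(\frac{|P(T)|}{|T|}\Big)^{1/\tau} \big(|f|_{W^{m,\tau}(T)} + \varepsilon |T|^{1/\tau}\big) + \varepsilon h^m_{P(T)} |P(T)|^{1/p},
\end{aligned}$$

where $C_2$ is the constant appearing in (3.45) and $D_2 := C_2 \max\{1, 1/\tau\}$. Combining this with (3.53), we obtain that for all $T \in \tilde{\mathcal{T}}_N \setminus \mathcal{D}_0$,

$$\eta \le D \big(|f|_{W^{m,\tau}(T)} + \varepsilon |T|^{1/\tau}\big),$$

where the constant $D$ depends on $C_2$, $m$ and on the refinement rule defining the hierarchy $(\mathcal{D}_j)_{j\ge 0}$. Elevating to the power $\tau$ and summing on all $T \in \tilde{\mathcal{T}}_N \setminus \mathcal{D}_0$, we thus obtain

$$(N/2 - N_0)\,\eta^\tau \le \max\{1, \tau\}\, D^\tau \big(|f|^\tau_{W^{m,\tau}} + \varepsilon^\tau |\Omega|\big),$$

where $N_0 := \#(\mathcal{D}_0)$. Combining with (3.43) we therefore obtain

$$e_{m,\mathcal{T}_N}(f)_p \le D \max\{\tau^{1/\tau}, 1/\tau\}\, N^{1/p} (N/2 - N_0)^{-1/\tau} \big(|f|_{W^{m,\tau}} + \varepsilon |\Omega|^{1/\tau}\big).$$

Taking $N \ge 4N_0$ and remarking that $\varepsilon > 0$ is arbitrary, we conclude that (3.52) holds with $C = 4^{1/\tau} D \max\{\tau^{1/\tau}, 1/\tau\}$. □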

Theorems 3.8 and 3.10 reveal that for smooth enough functions, the numerical quantity that governs the rate of convergence $N^{-m/d}$ in the $L^p$ norm of piecewise polynomial approximations on adaptive isotropic partitions is exactly $|f|_{W^{m,\tau}}$. In a similar way one would obtain that the same rate for quasi-uniform partitions is governed by the quantity $|f|_{W^{m,p}}$. Note however that these results are of asymptotic nature since they involve limsup and liminf as $N \to +\infty$, in contrast to Theorem 3.2. The results dealing with piecewise polynomial approximation on anisotropic adaptive partitions that we present in the next sections are of a similar asymptotic nature.

4 Anisotropic piecewise constant approximation on rectangles



We first explore a simple case of adaptive approximation on anisotropic partitions in two space dimensions. More precisely, we consider piecewise constant approximation in the $L^p$ norm on adaptive partitions by rectangles with sides parallel to the $x$ and $y$ axes. In order to build such partitions, $\Omega$ cannot be any polygonal domain, and for the sake of simplicity we fix it to be the unit square:

$$\Omega = [0,1]^2.$$

The family $\mathcal{A}_N$ consists therefore of all partitions of $\Omega$ of at most $N$ rectangles of the form

$$T = I \times J,$$

where $I$ and $J$ are intervals contained in $[0,1]$. This type of adaptive anisotropic partitions suffers from a strong coordinate bias due to the special role of the $x$ and $y$ directions: functions with sharp transitions on line edges are better approximated when these edges are parallel to the $x$ and $y$ axes. We shall remedy this defect in §5 by considering adaptive piecewise polynomial approximation on anisotropic partitions consisting of triangles, or simplices in higher dimension. Nevertheless, this first simple example is already instructive. In particular, it reveals that the numerical quantity governing the rate of approximation has an inherent non-linear structure. Throughout this section, we assume that $f$ belongs to $C^1([0,1]^2)$.



4.1 A heuristic estimate 

We first establish an error estimate which is based on the heuristic assumption that the partition is sufficiently fine so that we may consider that $\nabla f$ is constant on each $T$, or equivalently $f$ coincides with an affine function $q_T \in \mathbb{P}_1$ on each $T$. We thus first study the local $L^p$ approximation error on $T = I \times J$ for an affine function of the form

$$q(x,y) = q_0 + q_x x + q_y y.$$

Denoting by $\mathbf{q}(x,y) := q_x x + q_y y$ the homogeneous linear part of $q$, we first remark that

$$e_{1,T}(q)_p = e_{1,T}(\mathbf{q})_p, \tag{4.54}$$

since $q$ and $\mathbf{q}$ differ by a constant. We thus concentrate on $e_{1,T}(\mathbf{q})_p$ and discuss the shape of $T$ that minimizes this error when the area $|T| = 1$ is prescribed. We associate to this optimization problem a function $K_p$ that acts on the space of linear functions according to

$$K_p(\mathbf{q}) = \inf_{|T|=1} e_{1,T}(\mathbf{q})_p. \tag{4.55}$$

As we shall explain further, the above infimum may or may not be attained.

We start with some observations that can be derived by elementary changes of variable. If $a + T$ is a translation of $T$, then

$$e_{1,a+T}(\mathbf{q})_p = e_{1,T}(\mathbf{q})_p \tag{4.56}$$

since $\mathbf{q}$ and $\mathbf{q}(\cdot - a)$ differ by a constant. Therefore, if $T$ is a minimizing rectangle in (4.55), then $a + T$ is also one. If $hT$ is a dilation of $T$, then

$$e_{1,hT}(\mathbf{q})_p = h^{2/p+1} e_{1,T}(\mathbf{q})_p. \tag{4.57}$$

Therefore, if we are interested in minimizing the error for an area $|T| = A$, we find that

$$\inf_{|T|=A} e_{1,T}(\mathbf{q})_p = A^{1/\tau} K_p(\mathbf{q}), \qquad \frac{1}{\tau} := \frac{1}{p} + \frac{1}{2}, \tag{4.58}$$

and the minimizing rectangles for (4.58) are obtained by rescaling the minimizing rectangles for (4.55).

In order to compute $K_p(\mathbf{q})$, we thus consider a rectangle $T = I \times J$ of unit area whose barycenter is the origin. In the case $p = \infty$, using the notation $X := |q_x|\,|I|/2$ and $Y := |q_y|\,|J|/2$, we obtain

$$e_{1,T}(\mathbf{q})_\infty = X + Y.$$

We are thus interested in the minimization of the function $X + Y$ under the constraint $XY = |q_x q_y|/4$. Elementary computations show that when $q_x q_y \ne 0$, the infimum is attained when $X = Y = \frac{1}{2}\sqrt{|q_y q_x|}$, which yields

$$|I| = \sqrt{\frac{|q_y|}{|q_x|}} \quad\text{and}\quad |J| = \sqrt{\frac{|q_x|}{|q_y|}}.$$

Note that the optimal aspect ratio is given by the simple relation

$$\frac{|I|}{|J|} = \frac{|q_y|}{|q_x|}, \tag{4.59}$$

which expresses the intuitive fact that the refinement should be more pronounced in the direction where the function varies the most. Computing $e_{1,T}(\mathbf{q})_\infty$ for such an optimized rectangle, we find that

$$K_\infty(\mathbf{q}) = \sqrt{|q_x q_y|}. \tag{4.60}$$

In the case $p = 2$, we find that

$$\begin{aligned}
e_{1,T}(\mathbf{q})_2^2 &= \int_{-|I|/2}^{|I|/2}\int_{-|J|/2}^{|J|/2} \big(q_x^2 x^2 + q_y^2 y^2 + 2 q_x q_y xy\big)\,dy\,dx \\
&= \frac{4}{3}\Big(q_x^2 (|I|/2)^3\, |J|/2 + q_y^2 (|J|/2)^3\, |I|/2\Big) \\
&= \frac{1}{3}(X^2 + Y^2),
\end{aligned}$$

where we have used the fact that $|I|\,|J| = 1$. We now want to minimize the function $X^2 + Y^2$ under the constraint $XY = |q_x q_y|/4$. Elementary computations again show that when $q_x q_y \ne 0$, the infimum is again attained when $X = Y = \frac{1}{2}\sqrt{|q_y q_x|}$, and therefore leads to the same aspect ratio given by (4.59), and the value

$$K_2(\mathbf{q}) = \frac{1}{\sqrt{6}}\sqrt{|q_x q_y|}. \tag{4.61}$$



For other values of $p$ the computation of $e_{1,T}(\mathbf{q})_p$ is more tedious, but leads to the same conclusion: the optimal aspect ratio is given by (4.59) and the function $K_p$ has the general form

$$K_p(\mathbf{q}) = C_p \sqrt{|q_x q_y|}, \tag{4.62}$$

with $C_p := \Big(\frac{2}{(p+1)(p+2)}\Big)^{1/p}$. Note that the optimal shape of $T$ does not depend on the metric in which we measure the error.

By (4.54), (4.56) and (4.57), we find that for shape-optimized rectangles of arbitrary area, the error is given by

$$e_{1,T}(q)_p = |T|^{1/\tau} K_p(\mathbf{q}) = C_p \sqrt{|q_y q_x|}\; |T|^{1/\tau}. \tag{4.63}$$

Note that $C_p$ is uniformly bounded for all $p \ge 1$.
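The closed form (4.62) can be checked numerically. The sketch below is only an illustration and not part of the original analysis: it evaluates the $L^p$ norm of $\mathbf{q}$ on a centred rectangle of unit area by a midpoint rule (by symmetry, the best constant approximating $\mathbf{q}$ on such a rectangle is $0$), and performs a grid search over the side length $|I|$ centred on the value predicted by (4.59). The function names `c_p`, `rect_error` and `best_rectangle` and the quadrature resolution `n` are assumptions of this sketch.

```python
import math

def c_p(p):
    """Constant C_p = (2 / ((p + 1)(p + 2)))**(1/p) appearing in (4.62)."""
    return (2.0 / ((p + 1) * (p + 2))) ** (1.0 / p)

def rect_error(qx, qy, a, p, n=100):
    """L^p norm of q(x, y) = qx*x + qy*y on the centred rectangle with
    sides a and 1/a (unit area), computed with an n-by-n midpoint rule.
    By symmetry this equals the error of the best constant approximation."""
    b = 1.0 / a
    hx, hy = a / n, b / n
    total = 0.0
    for i in range(n):
        x = -a / 2 + (i + 0.5) * hx
        for j in range(n):
            y = -b / 2 + (j + 0.5) * hy
            total += abs(qx * x + qy * y) ** p
    return (total * hx * hy) ** (1.0 / p)

def best_rectangle(qx, qy, p, n=100):
    """Grid search over the side |I| of unit-area rectangles, on a geometric
    grid centred at the value sqrt(|qy| / |qx|) predicted by (4.59)."""
    guess = math.sqrt(abs(qy) / abs(qx))
    candidates = [guess * 2.0 ** (k / 8.0) for k in range(-16, 17)]
    errors = [rect_error(qx, qy, a, p, n) for a in candidates]
    k_best = min(range(len(errors)), key=errors.__getitem__)
    return candidates[k_best], errors[k_best]
```

For instance, with $q_x = 4$ and $q_y = 1$ the search should return a side length close to $|I| = 1/2$ (so that $|I|/|J| = 1/4 = |q_y|/|q_x|$) and an error close to $2\,C_p$, for each tested value of $p$.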

In the case where $\mathbf{q} \ne 0$ but $q_x q_y = 0$, the infimum in (4.55) is not attained, and the rectangles of a minimizing sequence tend to become infinitely long in the direction where $\mathbf{q}$ is constant. We ignore at the moment this degenerate case.

Since we have assumed that $f$ coincides with an affine function on $T$, the estimate (4.63) yields

$$e_{1,T}(f)_p = C_p \Big\| \sqrt{|\partial_x f\, \partial_y f|} \Big\|_{L^\tau(T)} = \|K_p(\nabla f)\|_{L^\tau(T)}, \qquad \frac{1}{\tau} := \frac{1}{p} + \frac{1}{2}, \tag{4.64}$$

where we have identified $\nabla f$ to the linear function $(x,y) \mapsto x\,\partial_x f + y\,\partial_y f$. This local estimate should be compared to those which were discussed in §3.1 for isotropic elements: in the bidimensional case, the estimate (3.30) of Theorem 3.1 can be restated as

$$e_{1,T}(f)_p \le C \|\nabla f\|_{L^\tau(T)}, \qquad \frac{1}{\tau} := \frac{1}{p} + \frac{1}{2}.$$

The improvement in (4.64) comes from the fact that $\sqrt{|\partial_x f\, \partial_y f|}$ may be substantially smaller than $|\nabla f|$ when $|\partial_x f|$ and $|\partial_y f|$ have different orders of magnitude, which reflects an anisotropic behaviour for the $x$ and $y$ directions. However, let us keep in mind that (4.64) is only valid when $f$ is identified to an affine function on $T$.

Assume now that the partition $\mathcal{T}_N$ is built in such a way that all rectangles have optimal shape in the above described sense, and obeys in addition the error equidistribution principle, which by (4.64) means that

$$\|K_p(\nabla f)\|_{L^\tau(T)} = \eta, \qquad T \in \mathcal{T}_N.$$

Then, we have on the one hand that

$$e_{1,\mathcal{T}_N}(f)_p \le \eta N^{1/p},$$

and on the other hand, that

$$N\eta^\tau \le \|K_p(\nabla f)\|^\tau_{L^\tau}.$$

Combining the two above, and using the relation $\frac{1}{\tau} = \frac{1}{p} + \frac{1}{2}$, we thus obtain the error estimate

$$\sigma_N(f)_p \le N^{-1/2} \|K_p(\nabla f)\|_{L^\tau}. \tag{4.65}$$



This estimate should be compared with those which were discussed in §3.2 for adaptive partitions with isotropic elements: for piecewise constant functions on adaptive isotropic partitions in the two dimensional case, the estimate (3.38) can be restated as

$$\sigma_N(f)_p \le C N^{-1/2} \|\nabla f\|_{L^\tau}, \qquad \frac{1}{\tau} = \frac{1}{p} + \frac{1}{2}.$$

As already observed for local estimates, the improvement in (4.64) comes from the fact that $|\nabla f|$ is replaced by the possibly much smaller $\sqrt{|\partial_x f\, \partial_y f|}$. It is interesting to note that the quantity

$$A_p(f) := \|K_p(\nabla f)\|_{L^\tau} = C_p \Big\| \sqrt{|\partial_x f\, \partial_y f|} \Big\|_{L^\tau}$$

is strongly nonlinear in the sense that it does not satisfy for any $f$ and $g$ an inequality of the type $A_p(f + g) \le C(A_p(f) + A_p(g))$, even with $C > 1$. This reflects the fact that two functions $f$ and $g$ may be well approximated by piecewise constants on anisotropic rectangular partitions while their sum $f + g$ may not be.
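A one-line example makes this failure concrete. It is an illustration chosen here rather than taken from the original text, and it uses only the definition of $A_p$ on $\Omega = [0,1]^2$:

```latex
f(x,y) = x,\quad g(x,y) = y:
\qquad
A_p(f) = C_p\,\big\|\sqrt{|1\cdot 0|}\big\|_{L^\tau} = 0,
\qquad
A_p(g) = 0,
\qquad
A_p(f+g) = C_p\,\big\|\sqrt{|1\cdot 1|}\big\|_{L^\tau([0,1]^2)} = C_p > 0 .
```

No constant $C$ can therefore satisfy $A_p(f+g) \le C\,(A_p(f) + A_p(g))$ for this pair.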



4.2 A rigorous estimate

We have used heuristic arguments to derive the estimate (4.65), and a simple example shows that this estimate cannot hold as such: if $f$ is a non-constant function that only depends on the variable $x$ or $y$, the quantity $A_p(f)$ vanishes while the error $\sigma_N(f)_p$ may be non-zero. In this section, we prove a valid estimate by a rigorous derivation. The price to pay is in the asymptotic nature of the new estimate, which has a form similar to those obtained in §3.4.

We first introduce a "tamed" variant of the function $K_p$, in which we restrict the search of the infimum to rectangles of limited diameter. For $M > 0$, we define

$$K_{p,M}(\mathbf{q}) = \min_{|T|=1,\; h_T \le M} e_{1,T}(\mathbf{q})_p. \tag{4.66}$$

In contrast to the definition of $K_p$, the above minimum is always attained, due to the compactness in the Hausdorff distance of the set of rectangles of area 1, diameter less than or equal to $M$, and centered at the origin. It is also not difficult to check that the functions $\mathbf{q} \mapsto e_{1,T}(\mathbf{q})_p$ are uniformly Lipschitz continuous for all $T$ of area 1 and diameter less than $M$: there exists a constant $C_M$ such that

$$|e_{1,T}(\mathbf{q})_p - e_{1,T}(\tilde{\mathbf{q}})_p| \le C_M |\mathbf{q} - \tilde{\mathbf{q}}|, \tag{4.67}$$

where $|\mathbf{q}| := (q_x^2 + q_y^2)^{1/2}$. In turn $K_{p,M}$ is also Lipschitz continuous with constant $C_M$. Finally, it is obvious that

$K_{p,M}(\mathbf{q}) \to K_p(\mathbf{q})$ as $M \to +\infty$.

If $f$ is a $C^1$ function, we denote by

$$\omega(\delta) := \max_{|z-z'|\le\delta} |\nabla f(z) - \nabla f(z')|,$$

the modulus of continuity of $\nabla f$, which satisfies $\lim_{\delta\to 0} \omega(\delta) = 0$. We also define for all $z \in \Omega$

$$q_z(z') = f(z) + \nabla f(z)\cdot(z'-z),$$

the Taylor polynomial of order 1 at $z$. We identify its linear part to the gradient of $f$ at $z$:

$$\mathbf{q}_z = \nabla f(z).$$

We thus have

$$|f(z') - q_z(z')| \le |z-z'|\,\omega(|z-z'|).$$

At each point $z$, we denote by $T_M(z)$ a rectangle of area 1 which is shape-optimized with respect to the gradient of $f$ at $z$ in the sense that it solves (4.66) with $\mathbf{q} = \mathbf{q}_z$. The following result gives an estimate of the local error for $f$ for such optimized rectangles.

Lemma 4.1 Let $T = a + hT_M(z)$ be a rescaled and shifted version of $T_M(z)$. We then have for any $z' \in T$

$$e_{1,T}(f)_p \le \big(K_{p,M}(\mathbf{q}_{z'}) + B_M\, \omega(\max\{|z-z'|, h_T\})\big)\, |T|^{1/\tau},$$

with $B_M := 2C_M + M$.



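Proof: For all $z, z' \in \Omega$, we have

$$\begin{aligned}
e_{1,T_M(z)}(\mathbf{q}_{z'})_p &\le e_{1,T_M(z)}(\mathbf{q}_z)_p + C_M |\mathbf{q}_z - \mathbf{q}_{z'}| \\
&= K_{p,M}(\mathbf{q}_z) + C_M |\mathbf{q}_z - \mathbf{q}_{z'}| \\
&\le K_{p,M}(\mathbf{q}_{z'}) + 2C_M |\mathbf{q}_z - \mathbf{q}_{z'}| \\
&\le K_{p,M}(\mathbf{q}_{z'}) + 2C_M\, \omega(|z-z'|).
\end{aligned}$$

We then observe that if $z' \in T$

$$\begin{aligned}
e_{1,T}(f)_p &\le e_{1,T}(q_{z'})_p + \|f - q_{z'}\|_{L^p(T)} \\
&\le e_{1,T_M(z)}(\mathbf{q}_{z'})_p\, |T|^{1/\tau} + \|f - q_{z'}\|_{L^\infty(T)}\, |T|^{1/p} \\
&\le \big(K_{p,M}(\mathbf{q}_{z'}) + 2C_M\, \omega(|z-z'|)\big)\, |T|^{1/\tau} + h_T\, \omega(h_T)\, |T|^{1/p} \\
&\le \big(K_{p,M}(\mathbf{q}_{z'}) + 2C_M\, \omega(|z-z'|) + M \omega(h_T)\big)\, |T|^{1/\tau},
\end{aligned}$$

which concludes the proof. □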
We are now ready to state our main convergence theorem.

Theorem 4.2 For piecewise constant approximation on adaptive anisotropic partitions on rectangles, we have

$$\limsup_{N\to+\infty} N^{1/2} \sigma_N(f)_p \le \|K_p(\nabla f)\|_{L^\tau}, \tag{4.68}$$

for all $f \in C^1([0,1]^2)$.

Proof: We first fix some numbers $\delta > 0$ and $M > 0$ that are later pushed towards $0$ and $+\infty$ respectively. We define a uniform partition $\mathcal{T}_\delta$ of $[0,1]^2$ into squares $S$ of diameter $h_S \le \delta$, for example by $j_0$ iterations of uniform dyadic refinement, where $j_0$ is chosen large enough such that $2^{-j_0+1/2} \le \delta$. We then build partitions $\mathcal{T}_N$ by further decomposing the square elements of $\mathcal{T}_\delta$ in an anisotropic way. For each $S \in \mathcal{T}_\delta$, we pick an arbitrary point $z_S \in S$ (for example the barycenter of $S$) and consider the Taylor polynomial $q_{z_S}$ of degree 1 of $f$ at this point. We denote by $T_S = T_M(z_S)$ the rectangle of area 1 such that

$$e_{1,T_S}(\mathbf{q}_{z_S})_p = \min_{|T|=1,\; h_T\le M} e_{1,T}(\mathbf{q}_{z_S})_p = K_{p,M}(\mathbf{q}_{z_S}).$$

For $h > 0$, we rescale this rectangle according to

$$T_{h,S} = h\,\big(K_{p,M}(\mathbf{q}_{z_S}) + (B_M + C_M)\,\omega(\delta) + \delta\big)^{-\tau/2}\, T_S,$$

and we define $\mathcal{T}_{h,S}$ the tiling of the plane by $T_{h,S}$ and its translates. We assume that $hC_A \le \delta$ so that $h_T \le \delta$ for all $T \in \mathcal{T}_{h,S}$ and all $S$. Finally, we define the partition

$$\mathcal{T}_N = \{T \cap S \; ; \; T \in \mathcal{T}_{h,S} \text{ and } S \in \mathcal{T}_\delta\}.$$

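We first estimate the local approximation error. By Lemma 4.1, we obtain that for all $T \in \mathcal{T}_{h,S}$ and $z' \in T \cap S$

$$\begin{aligned}
e_{1,T\cap S}(f)_p &\le e_{1,T}(f)_p \\
&\le \big(K_{p,M}(\mathbf{q}_{z'}) + B_M\, \omega(\delta)\big)\, |T|^{1/\tau} \\
&\le h^{2/\tau} \big(K_{p,M}(\mathbf{q}_{z_S}) + (B_M + C_M)\,\omega(\delta)\big)\big(K_{p,M}(\mathbf{q}_{z_S}) + (B_M + C_M)\,\omega(\delta) + \delta\big)^{-1} \\
&\le h^{2/\tau}.
\end{aligned}$$

The rescaling has therefore the effect of equidistributing the error on all rectangles of $\mathcal{T}_N$, and the global approximation error is bounded by

$$e_{1,\mathcal{T}_N}(f)_p \le N^{1/p} h^{2/\tau}. \tag{4.69}$$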
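We next estimate the number of rectangles $N = \#(\mathcal{T}_N)$, which behaves like

$$\begin{aligned}
N &= (1 + \eta(h))\, h^{-2} \sum_{S\in\mathcal{T}_\delta} |S|\, \big(K_{p,M}(\mathbf{q}_{z_S}) + (B_M + C_M)\,\omega(\delta) + \delta\big)^\tau \\
&= (1 + \eta(h))\, h^{-2} \sum_{S\in\mathcal{T}_\delta} \int_S \big(K_{p,M}(\mathbf{q}_{z_S}) + (B_M + C_M)\,\omega(\delta) + \delta\big)^\tau,
\end{aligned}$$

where $\eta(h) \to 0$ as $h \to 0$. Recalling that $K_{p,M}$ is Lipschitz continuous with constant $C_M$, it follows that

$$N \le (1 + \eta(h))\, h^{-2} \int_\Omega \big(K_{p,M}(\mathbf{q}_z) + (B_M + 2C_M)\,\omega(\delta) + \delta\big)^\tau. \tag{4.70}$$

Combining (4.69) and (4.70), we have thus obtained

$$N^{1/2} e_{1,\mathcal{T}_N}(f)_p \le (1 + \eta(h))^{1/\tau}\, \big\|K_{p,M}(\mathbf{q}_z) + (B_M + 2C_M)\,\omega(\delta) + \delta\big\|_{L^\tau}.$$

Observing that for all $\varepsilon > 0$, we can choose $M$ large enough and $\delta$ and $h$ small enough so that

$$(1 + \eta(h))^{1/\tau}\, \big\|K_{p,M}(\mathbf{q}_z) + (B_M + 2C_M)\,\omega(\delta) + \delta\big\|_{L^\tau} \le \|K_p(\mathbf{q}_z)\|_{L^\tau} + \varepsilon,$$

this concludes the proof. □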
In a similar way as in Theorem 3.8, we can establish a lower estimate on $\sigma_N(f)_p$, which reflects the saturation rate $N^{-1/2}$ of the method, and shows that the numerical quantity that governs this rate is exactly equal to $\|K_p(\nabla f)\|_{L^\tau}$. We again impose a slight restriction on the set $\mathcal{A}_N$ of admissible partitions, assuming that the diameter of all elements decreases as $N \to +\infty$, according to

$$\max_{T\in\mathcal{T}_N} h_T \le A N^{-1/2}, \tag{4.71}$$

for some $A > 0$ which may be arbitrarily large.

Theorem 4.3 Under the restriction (4.71), we have

$$\liminf_{N\to+\infty} N^{1/2} \sigma_N(f)_p \ge \|K_p(\nabla f)\|_{L^\tau} \tag{4.72}$$

for all $f \in C^1(\Omega)$, where $\frac{1}{\tau} := \frac{1}{p} + \frac{1}{2}$.

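Proof: We assume here $p < \infty$. The case $p = \infty$ can be treated by a simple modification of the argument. Here, we need a lower estimate for the local approximation error, which is a counterpart to Lemma 4.1. We start by remarking that for all rectangles $T \subset \Omega$ and $z \in T$, we have

$$|e_{1,T}(f)_p - e_{1,T}(q_z)_p| \le \|f - q_z\|_{L^p(T)} \le |T|^{1/p} h_T\, \omega(h_T),$$

and therefore

$$e_{1,T}(f)_p \ge e_{1,T}(q_z)_p - |T|^{1/p} h_T\, \omega(h_T) \ge K_p(\mathbf{q}_z)\, |T|^{1/\tau} - |T|^{1/p} h_T\, \omega(h_T).$$

Then, using the fact that if $(a,b,c)$ are positive numbers such that $a \ge b - c$ one has $a^p \ge b^p - p\,c\,b^{p-1}$, we find that

$$\begin{aligned}
e_{1,T}(f)_p^p &\ge K_p(\mathbf{q}_z)^p\, |T|^{p/\tau} - p\, K_p(\mathbf{q}_z)^{p-1}\, |T|^{(p-1)/\tau}\, |T|^{1/p} h_T\, \omega(h_T) \\
&= K_p(\mathbf{q}_z)^p\, |T|^{1+p/2} - p\, K_p(\mathbf{q}_z)^{p-1}\, |T|^{1+(p-1)/2}\, h_T\, \omega(h_T).
\end{aligned}$$

Defining $C := p \max_{z\in\Omega} K_p(\mathbf{q}_z)^{p-1}$ and remarking that $|T|^{(p-1)/2} \le h_T^{p-1}$, this leads to the estimate

$$e_{1,T}(f)_p^p \ge K_p(\mathbf{q}_z)^p\, |T|^{1+p/2} - C h_T^p\, |T|\, \omega(h_T).$$

Since we work under the assumption (4.71), we can rewrite this estimate as

$$e_{1,T}(f)_p^p \ge K_p(\mathbf{q}_z)^p\, |T|^{1+p/2} - C\, |T|\, N^{-p/2} \varepsilon(N), \tag{4.73}$$

where $\varepsilon(N) \to 0$ as $N \to \infty$. Integrating (4.73) over $T$ and dividing by $|T|$ gives

$$e_{1,T}(f)_p^p \ge \int_T \big(K_p(\mathbf{q}_z)^p\, |T|^{p/2} - C N^{-p/2} \varepsilon(N)\big)\, dz.$$

Summing over all rectangles $T \in \mathcal{T}_N$ and denoting by $T_z$ the rectangle that contains $z$, we thus obtain

$$e_{1,\mathcal{T}_N}(f)_p^p \ge \int_\Omega K_p(\nabla f(z))^p\, |T_z|^{p/2}\, dz - C\, |\Omega|\, N^{-p/2} \varepsilon(N). \tag{4.74}$$

Using Hölder's inequality, we find that

$$\int_\Omega K_p(\nabla f(z))^\tau\, dz \le \Big(\int_\Omega K_p(\nabla f(z))^p\, |T_z|^{p/2}\, dz\Big)^{\tau/p} \Big(\int_\Omega |T_z|^{-1}\, dz\Big)^{1-\tau/p}. \tag{4.75}$$

Since $\int_\Omega |T_z|^{-1}\, dz = \#(\mathcal{T}_N) = N$, it follows that

$$e_{1,\mathcal{T}_N}(f)_p^p \ge \|K_p(\nabla f)\|^p_{L^\tau}\, N^{-p/2} - C\, |\Omega|\, N^{-p/2} \varepsilon(N),$$

which concludes the proof. □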



Remark 4.4 The Hölder inequality (4.75) which is used in the above proof becomes an equality when the quantities $K_p(\nabla f(z))^p\, |T_z|^{p/2}$ and $|T_z|^{-1}$ are proportional, i.e. $K_p(\nabla f(z))\, |T_z|^{1/\tau}$ is constant, which again reflects the principle of error equidistribution. In summary, the optimal partitions should combine this principle with locally optimized shapes for each element.

5 Anisotropic piecewise polynomial approximation



We turn to adaptive piecewise polynomial approximation on anisotropic partitions consisting of triangles, or simplices in higher dimension. Here $\Omega \subset \mathbb{R}^d$ is a domain that can be decomposed into such partitions, therefore a polygon when $d = 2$, a polyhedron when $d = 3$, etc. The family $\mathcal{A}_N$ consists therefore of all partitions of $\Omega$ of at most $N$ simplices. The first estimates of the form (1.6) were rigorously established in [17] and [5] in the case of piecewise linear elements for bidimensional triangulations. Generalizations to higher polynomial degree as well as higher dimensions were recently proposed in [14, 15, 16] as well as in [39]. Here we follow the general approach of [39] to the characterization of optimal partitions.

5.1 The shape function 

If $f$ belongs to $C^m(\Omega)$, where $m - 1$ is the degree of the piecewise polynomials that we use for approximation, we mimic the heuristic approach proposed for piecewise constants on rectangles in §4.1 by assuming that on each triangle $T$ the relative variation of $d^m f$ is small so that it can be considered as a constant over $T$. This means that $f$ is locally identified with its Taylor polynomial of degree $m$ at $z$, which is defined as

$$q_z(z') := f(z) + \nabla f(z)\cdot(z'-z) + \sum_{k=2}^{m} \frac{1}{k!}\, d^k f(z)[z'-z, \cdots, z'-z].$$

If $q \in \mathbb{P}_m$ is a polynomial of degree $m$, we denote by $\mathbf{q} \in \mathbb{H}_m$ its homogeneous part of degree $m$. For $q = q_z$ we can identify $\mathbf{q}_z \in \mathbb{H}_m$ with $\frac{1}{m!} d^m f(z)$. Since $q - \mathbf{q} \in \mathbb{P}_{m-1}$ we have

$$e_{m,T}(q)_p = e_{m,T}(\mathbf{q})_p.$$

We optimize the shape of the simplex $T$ with respect to $\mathbf{q}$ by introducing the function $K_{m,p}$ defined on the space $\mathbb{H}_m$ by

$$K_{m,p}(\mathbf{q}) := \inf_{|T|=1} e_{m,T}(\mathbf{q})_p, \tag{5.76}$$

where the infimum is taken among all triangles of area 1. This infimum may or may not be attained. We refer to $K_{m,p}$ as the shape function. It is obviously a generalization of the function $K_p$ introduced for piecewise constants on rectangles in §4.1.

As in the case of rectangles, some elementary properties of Km.p are obtained by change of variable: if a + T 
is a shifted version of T, then 

em,a+T{(l)p^e,„j{q)p (5.77) 
since q and q(- — a) differ by a polynomial of degree m — 1, and that if hT is a dilation of T, then 

emMT{q)p = h^'"+"'em.T{q)p (5.78) 

Therefore, if T is a minimizing simplex in ( |5.76^ , then a + T is also one, and if we are interested in minimizing 
the error for a given area |r| = A, we find that 

inf e,„,T{q)p=A^/''K,„,p{q), - := - + ^ (5.79) 

\T\=A ' T p d 

and the minimizing simplex for ( |4.58| l are obtained by rescaling the minimizing simplex for j4.55| l. 
Remarking in addition that if $\phi$ is an invertible linear transform, we then have for all $f$

$$|\det(\phi)|^{1/p}\, e_{m,T}(f\circ\phi)_p = e_{m,\phi(T)}(f)_p,$$

and using (5.79), we also obtain that

$$K_{m,p}(\mathbf{q}\circ\phi) = |\det(\phi)|^{m/d}\, K_{m,p}(\mathbf{q}). \qquad (5.80)$$

The minimizing simplex of area 1 for $\mathbf{q}\circ\phi$ is obtained by application of $\phi^{-1}$ followed by a rescaling by
$|\det(\phi)|^{1/d}$ to the minimizing simplex of area 1 for $\mathbf{q}$ if it exists.

5.2 Algebraic expressions of the shape function

The identity (5.80) can be used to derive the explicit expression of $K_{m,p}$ for particular values of $(m,p,d)$, as well
as the exact shape of the minimizing triangle $T$ in (5.76).

We first consider the case of piecewise affine elements on two-dimensional triangulations, which corresponds
to $d = m = 2$. Here $\mathbf{q}$ is a quadratic form and we denote by $\det(\mathbf{q})$ its determinant. We also denote by $|\mathbf{q}|$ the
positive quadratic form associated with the absolute value of the symmetric matrix associated to $\mathbf{q}$.

If $\det(\mathbf{q}) \neq 0$, there exists a $\phi$ such that $\mathbf{q}\circ\phi$ is either $x^2+y^2$ or $x^2-y^2$, up to a sign change, and we have
$|\det(\mathbf{q})| = |\det(\phi)|^{-2}$. It follows from (5.80) that $K_{2,p}(\mathbf{q})$ has the simple form

$$K_{2,p}(\mathbf{q}) = \kappa_p\, |\det(\mathbf{q})|^{1/2}, \qquad (5.81)$$

where $\kappa_p := K_{2,p}(x^2+y^2)$ if $\det(\mathbf{q}) > 0$ and $\kappa_p := K_{2,p}(x^2-y^2)$ if $\det(\mathbf{q}) < 0$.
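The invariance behind (5.81) can be checked numerically. The following is a minimal sketch, not taken from the source: the helper names `compose`, `det2` and `shape_invariant` are hypothetical, and the constant factor in (5.81) is deliberately left out since only the scaling in $\phi$ is tested. It verifies that $\sqrt{|\det(\mathbf{q}\circ\phi)|} = |\det(\phi)|\,\sqrt{|\det(\mathbf{q})|}$, which is the case $m = d = 2$ of (5.80).

```python
from math import sqrt, isclose

def compose(Q, phi):
    """Matrix of the quadratic form q o phi, i.e. phi^T Q phi (2x2 nested lists)."""
    (a, b), (c, d) = phi
    (q11, q12), (q21, q22) = Q
    # M = Q * phi
    m11, m12 = q11 * a + q12 * c, q11 * b + q12 * d
    m21, m22 = q21 * a + q22 * c, q21 * b + q22 * d
    # phi^T * M
    return [[a * m11 + c * m21, a * m12 + c * m22],
            [b * m11 + d * m21, b * m12 + d * m22]]

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def shape_invariant(Q):
    """sqrt(|det q|): the q-dependent factor on the right-hand side of (5.81)."""
    return sqrt(abs(det2(Q)))

# Illustrative data (assumed, not from the text).
Q = [[3.0, 1.0], [1.0, -2.0]]    # indefinite quadratic form, det = -7
phi = [[2.0, 1.0], [0.5, 3.0]]   # invertible linear map, det = 5.5

lhs = shape_invariant(compose(Q, phi))
rhs = abs(det2(phi)) * shape_invariant(Q)
print(lhs, rhs)
```

Since composition with $\phi$ does not change the sign of the determinant, the same constant applies on both sides, so the two printed numbers should agree up to rounding.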

The triangle of area 1 that minimizes the $L^p$ error when $\mathbf{q} = x^2+y^2$ is the equilateral triangle, which is
unique up to rotations. For $\mathbf{q} = x^2-y^2$, the triangle that minimizes the error is unique up to a hyperbolic
transformation with eigenvalues $t$ and $1/t$ and eigenvectors $(1,1)$ and $(1,-1)$ for any $t \neq 0$. Therefore, such
triangles may be highly anisotropic, but at least one of them is isotropic. For example, it can be checked that a
triangle of area 1 that minimizes the $L^\infty$ error is given by the half square with vertices $((0,0), (\sqrt{2},0), (0,\sqrt{2}))$. It
can also be checked that an equilateral triangle $T$ of area 1 is a "near-minimizer" in the sense that

$$e_{2,T}(\mathbf{q})_p \leq C K_{2,p}(\mathbf{q}),$$

where $C$ is a constant independent of $p$. It follows that when $\det(\mathbf{q}) \neq 0$, the triangles which are isotropic with
respect to the distorted metric induced by $|\mathbf{q}|$ are "optimally adapted" to $\mathbf{q}$ in the sense that they nearly minimize
the $L^p$ error among all triangles of similar area.

In the case when $\det(\mathbf{q}) = 0$, which corresponds to one-dimensional quadratic forms $\mathbf{q} = (ax + by)^2$, the
minimum in (5.76) is not attained and the minimizing triangles become infinitely long along the null cone of $\mathbf{q}$.
In that case one has $K_{2,p}(\mathbf{q}) = 0$ and the equality (5.81) remains therefore valid.



These results easily generalize to piecewise affine functions on simplicial partitions in higher dimension $d > 1$:
one obtains

$$K_{2,p}(\mathbf{q}) = \kappa_p\, |\det(\mathbf{q})|^{1/d}, \qquad (5.82)$$

where $\kappa_p$ only takes a finite number of possible values. When $\det(\mathbf{q}) \neq 0$, the simplices which are isotropic with
respect to the distorted metric induced by $|\mathbf{q}|$ are "optimally adapted" to $\mathbf{q}$ in the sense that they nearly minimize
the $L^p$ error among all simplices of similar volume.

The analysis becomes more delicate for higher polynomial degree $m \geq 3$. For piecewise quadratic elements
in dimension two, which corresponds to $m = 3$ and $d = 2$, it is proved in [39] that

$$K_{3,p}(\mathbf{q}) = \kappa_p\, |\mathrm{disc}(\mathbf{q})|^{1/4},$$

for any homogeneous polynomial $\mathbf{q} \in \mathbb{H}_3$, where

$$\mathrm{disc}(ax^3 + bx^2y + cxy^2 + dy^3) := b^2c^2 - 4ac^3 - 4b^3d + 18abcd - 27a^2d^2,$$

is the usual discriminant and $\kappa_p$ only takes two values depending on the sign of $\mathrm{disc}(\mathbf{q})$. The analysis that leads
to this result also describes the shape of the triangles which are optimally adapted to $\mathbf{q}$.
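The exponent $1/4$ is consistent with (5.80): the discriminant of a binary cubic satisfies $\mathrm{disc}(\mathbf{q}\circ\phi) = \det(\phi)^6\,\mathrm{disc}(\mathbf{q})$, so that $|\mathrm{disc}|^{1/4}$ scales like $|\det(\phi)|^{3/2} = |\det(\phi)|^{m/d}$. The sketch below checks this classical invariance on one integer example; the function names and the test data are illustrative assumptions, not part of the source.

```python
def disc(a, b, c, d):
    """Discriminant of the binary cubic a*x^3 + b*x^2*y + c*x*y^2 + d*y^3."""
    return (b * b * c * c - 4 * a * c ** 3 - 4 * b ** 3 * d
            + 18 * a * b * c * d - 27 * a * a * d * d)

def poly_mul(p, r):
    """Product of two homogeneous polynomials in (x, y).

    A polynomial of degree n is the list of its coefficients of
    x^n, x^(n-1)*y, ..., y^n in that order.
    """
    out = [0] * (len(p) + len(r) - 1)
    for i, pi in enumerate(p):
        for j, rj in enumerate(r):
            out[i + j] += pi * rj
    return out

def compose_cubic(coeffs, phi):
    """Coefficients of q(alpha*x + beta*y, gamma*x + delta*y)."""
    a, b, c, d = coeffs
    (alpha, beta), (gamma, delta) = phi
    u, v = [alpha, beta], [gamma, delta]      # the two linear forms
    result = [0, 0, 0, 0]
    for coef, factors in ((a, (u, u, u)), (b, (u, u, v)),
                          (c, (u, v, v)), (d, (v, v, v))):
        term = [1]
        for factor in factors:
            term = poly_mul(term, factor)
        result = [r + coef * t for r, t in zip(result, term)]
    return tuple(result)

q = (1, 0, -1, 0)            # x^3 - x*y^2 = x (x - y) (x + y)
phi = [[2, 1], [1, 3]]       # det(phi) = 5
q_phi = compose_cubic(q, phi)
print(q_phi, disc(*q_phi), 5 ** 6 * disc(*q))
```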

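For other values of $m$ and $d$, the exact expression of $K_{m,p}(\mathbf{q})$ is unknown, but it is possible to give equivalent
versions in terms of polynomials $Q_{m,d}$ in the coefficients of $\mathbf{q}$, in the following sense: for all $\mathbf{q} \in \mathbb{H}_m$,

$$c_1\, (Q_{m,d}(\mathbf{q}))^{\frac{1}{r}} \leq K_{m,p}(\mathbf{q}) \leq c_2\, (Q_{m,d}(\mathbf{q}))^{\frac{1}{r}},$$

where $r := \deg(Q_{m,d})$, see [39].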

Remark 5.1 It is easily checked that the shape functions $\mathbf{q} \mapsto K_{m,p}(\mathbf{q})$ are equivalent for all $p$ in the sense that
there exist constants $0 < C_1 \leq C_2$ that only depend on the dimension $d$ such that

$$C_1 K_{m,\infty}(\mathbf{q}) \leq K_{m,p}(\mathbf{q}) \leq C_2 K_{m,\infty}(\mathbf{q}),$$

for all $\mathbf{q} \in \mathbb{H}_m$ and $p \geq 1$. In particular a minimizing triangle for $K_{m,\infty}$ is a near-minimizing triangle for $K_{m,p}$. In
that sense, the optimal shape of the element does not strongly depend on $p$.

5.3 Error estimates



Following at first a similar heuristics as in §4.1 for piecewise constants on rectangles, we assume that the
triangulation $\mathcal{T}_N$ is such that all its triangles $T$ have optimized shape with respect to the polynomial $q$ that coincides
with $f$ on $T$.

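According to (5.79), we thus have for any triangle $T \in \mathcal{T}_N$,

$$e_{m,T}(f)_p = |T|^{\frac{1}{\tau}} K_{m,p}(\mathbf{q}) = \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau(T)}.$$

We then apply the principle of error equidistribution, assuming that

$$e_{m,T}(f)_p = \eta,$$

from which it follows that $e_{m,\mathcal{T}_N}(f)_p \leq N^{1/p}\eta$ and

$$N\eta^\tau = \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau}^\tau,$$

and therefore

$$\sigma_N(f)_p \leq N^{-m/d} \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau}. \qquad (5.83)$$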
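The arithmetic behind (5.83) can be traced in a few lines. This is only a sketch for finite $p$: the function name `equidistributed_error` and the numerical values are assumptions made for illustration, and `K_norm` stands for the $L^\tau$ norm appearing on the right-hand side of (5.83).

```python
from math import isclose

def equidistributed_error(N, p, m, d, K_norm):
    """Local and global errors under exact equidistribution (finite p only).

    eta is chosen so that N * eta**tau == K_norm**tau, and the global
    L^p error of N elements each carrying the error eta is N**(1/p) * eta.
    """
    tau = 1.0 / (1.0 / p + m / d)
    eta = K_norm * N ** (-1.0 / tau)
    total = N ** (1.0 / p) * eta
    return eta, total

# Illustrative values: piecewise affine elements (m = 2) in dimension d = 2.
eta, total = equidistributed_error(N=1000, p=2.0, m=2, d=2, K_norm=3.0)
print(eta, total, 3.0 * 1000 ** (-2 / 2))
```

The exponent $\frac{1}{p} - \frac{1}{\tau} = -\frac{m}{d}$ is what produces the rate $N^{-m/d}$.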



This estimate should be compared to (3.38) which was obtained for adaptive partitions with elements of isotropic
shape. The essential difference is in the quantity $K_{m,p}\left(\frac{d^m f}{m!}\right)$ which replaces $d^m f$ in the $L^\tau$ norm, and which may
be significantly smaller. Consider for example the case of piecewise affine elements, for which we can combine
(5.83) with (5.82) to obtain

$$\sigma_N(f)_p \leq C N^{-2/d} \left\| |\det(d^2 f)|^{1/d} \right\|_{L^\tau}. \qquad (5.84)$$

In comparison to (3.38), the norm of the hessian $|d^2 f|$ is replaced by the quantity $|\det(d^2 f)|^{1/d}$ which is the geometric
mean of the absolute values of its eigenvalues, a quantity which is significantly smaller when two eigenvalues have
different orders of magnitude, which reflects an anisotropic behaviour in $f$.
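To make the gain concrete in dimension $d = 2$, the sketch below compares the spectral norm of a symmetric $2\times 2$ hessian with the geometric mean $\sqrt{|\det|}$ of the absolute values of its eigenvalues. The helper name and the sample hessian $\mathrm{diag}(1, 10^{-4})$ are illustrative assumptions.

```python
from math import sqrt, isclose

def hessian_measures(h11, h12, h22):
    """Spectral norm and sqrt(|det|) of the symmetric matrix [[h11, h12], [h12, h22]]."""
    mean = 0.5 * (h11 + h22)
    radius = sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)
    lam1, lam2 = mean + radius, mean - radius        # the two real eigenvalues
    norm = max(abs(lam1), abs(lam2))                 # isotropic measure
    geo = sqrt(abs(h11 * h22 - h12 * h12))           # anisotropic measure, d = 2
    return norm, geo

# f(x, y) = x^2/2 + eps*y^2/2 has the constant hessian diag(1, eps).
eps = 1e-4
norm, geo = hessian_measures(1.0, 0.0, eps)
print(norm, geo)
```

Here the isotropic quantity stays equal to 1 while the anisotropic one is $\sqrt{\varepsilon}$, two orders of magnitude smaller.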

As in the case of piecewise constants on rectangles, the example of a function $f$ depending on only one
variable shows that the estimate (5.84) cannot hold as such. We may obtain some valid estimates by following
the same approach as in Theorem 4.2. This leads to the following result which is established in [39].

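Theorem 5.2 For piecewise polynomial approximation on adaptive anisotropic partitions into simplices, we have

$$\limsup_{N\to+\infty} N^{m/d}\, \sigma_N(f)_p \leq C \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau}, \qquad \frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}, \qquad (5.85)$$

for all $f \in C^m(\Omega)$. The constant $C$ can be chosen equal to 1 in the case of two-dimensional triangulations $d = 2$.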

The proof of this theorem follows exactly the same line as the one of Theorem 4.2: we build a sequence of
partitions $\mathcal{T}_N$ by refining the triangles $S$ of a sufficiently fine quasi-uniform partition, intersecting each $S$ with
a partition $\mathcal{P}_S$ by elements with shape optimally adapted to the local value of $d^m f$ on each $S$. The constant $C$
can be chosen equal to 1 in the two-dimensional case, due to the fact that it is then possible to build $\mathcal{P}_S$ as a
tiling of triangles which are all optimally adapted. This is no longer possible in higher dimension, which explains
the presence of a constant $C = C(m,d)$ larger than 1.

We may also obtain lower estimates, following the same approach as in Theorem 4.3: we first impose a
slight restriction on the set $\mathcal{A}_N$ of admissible partitions, assuming that the diameter of the elements decreases as
$N \to +\infty$, according to

$$\max_{T \in \mathcal{T}_N} h_T \leq A N^{-1/d}, \qquad (5.86)$$

for some $A > 0$ which may be arbitrarily large. We then obtain the following result, whose proof is similar to the
one of Theorem 4.3.



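Theorem 5.3 Under the restriction (5.86), we have

$$\liminf_{N\to+\infty} N^{m/d}\, \sigma_N(f)_p \geq \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau} \qquad (5.87)$$

for all $f \in C^m(\Omega)$, where $\frac{1}{\tau} := \frac{1}{p} + \frac{m}{d}$.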

5.4 Anisotropic smoothness and cartoon functions



Theorem 5.2 reveals an improvement over the approximation results based on adaptive isotropic partitions in the
sense that $\left\| K_{m,p}\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau}$ may be significantly smaller than $\|d^m f\|_{L^\tau}$, for functions which have an anisotropic
behaviour. However, this result suffers from two major defects:

1. The estimate (5.85) is asymptotic: it says that for all $\varepsilon > 0$, there exists $N_0$ depending on $f$ and $\varepsilon$ such that

$$\sigma_N(f)_p \leq N^{-m/d} \left( C \left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau} + \varepsilon \right),$$

for all $N \geq N_0$. However, it does not ensure a uniform bound on $N_0$ which may be very large for certain $f$.



2. Theorem 5.2 is based on the assumption $f \in C^m(\Omega)$, and therefore the estimate (5.85) only seems to apply
to sufficiently smooth functions. This is in contrast to the estimates that we have obtained for adaptive
isotropic partitions, which are based on the assumption that $f \in W^{m,\tau}(\Omega)$ or $f \in B^m_{\tau,\tau}(\Omega)$.

The first defect is due to the fact that a certain amount of refinement should be performed before the relative
variation of $d^m f$ is sufficiently small so that there is no ambiguity in defining the optimal shape of the simplices.
It is in that sense unavoidable.

The second defect raises a legitimate question concerning the validity of the convergence estimate (5.85) for
functions which are not in $C^m(\Omega)$. It suggests in particular to introduce a class of distributions such that

$$\left\| K_{m,p}\!\left(\frac{d^m f}{m!}\right) \right\|_{L^\tau} < +\infty,$$

and to try to understand if the estimate remains valid inside this class, which describes in some sense functions
which have a certain amount of anisotropic smoothness. The main difficulty is that this class is not well defined
due to the nonlinear nature of $K_{m,p}\left(\frac{d^m f}{m!}\right)$. As an example consider the case of piecewise linear elements on
two-dimensional triangulations, which corresponds to $m = d = 2$. In this case, we have seen that
$K_{2,p}(\mathbf{q}) = \kappa_p \sqrt{|\det(\mathbf{q})|}$. The numerical quantity that governs the approximation rate $N^{-1}$ is thus

$$A_p(f) := \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau}, \qquad \frac{1}{\tau} := \frac{1}{p} + 1.$$



However, this quantity cannot be defined in the distribution sense since the product of two distributions is
generally ill-defined. On the other hand, it is known that the rate $N^{-1}$ can be achieved for functions which do not
have $C^2$ smoothness, and which may even be discontinuous along curved edges. Specifically, we say that $f$ is a
cartoon function on $\Omega$ if it is almost everywhere of the form

$$f = \sum_{1 \leq i \leq k} f_i\, \chi_{\Omega_i},$$

where the $\Omega_i$ are disjoint open sets with piecewise $C^2$ boundary, no cusps (i.e. satisfying an interior and exterior
cone condition), and such that $\overline{\Omega} = \cup_{i=1}^{k} \overline{\Omega_i}$, and where for each $1 \leq i \leq k$, the function $f_i$ is $C^2$ on a neighbourhood
of $\overline{\Omega_i}$. Such functions are natural candidates to represent images with sharp edges or solutions of PDE's with
shock profiles.

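Let us consider a fixed cartoon function $f$ on a polygonal domain $\Omega$ associated with a partition $(\Omega_i)_{1 \leq i \leq k}$.
We define

$$\Gamma := \bigcup_{1 \leq i \leq k} \partial\Omega_i,$$

the union of the boundaries of the $\Omega_i$. The above definition implies that $\Gamma$ is the disjoint union of a finite set of
points $\mathcal{P}$ and a finite number of open curves $(\Gamma_i)_{1 \leq i \leq l}$:

$$\Gamma = \mathcal{P} \cup \Big( \bigcup_{1 \leq i \leq l} \Gamma_i \Big).$$

If we consider the approximation of $f$ by piecewise affine functions on a triangulation $\mathcal{T}_N$ of cardinality $N$, we
may distinguish two types of elements of $\mathcal{T}_N$. A triangle $T \in \mathcal{T}_N$ is called "regular" if $T \cap \Gamma = \emptyset$, and we denote
the set of such triangles by $\mathcal{T}_N^r$. Other triangles are called "edgy" and their set is denoted by $\mathcal{T}_N^e$. We can thus
split $\Omega$ according to

$$\Omega = \Omega_N^r \cup \Omega_N^e, \qquad \Omega_N^r := \bigcup_{T \in \mathcal{T}_N^r} T, \qquad \Omega_N^e := \bigcup_{T \in \mathcal{T}_N^e} T.$$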
We split accordingly the $L^p$ approximation error into

$$e_{2,\mathcal{T}_N}(f)_p^p = \sum_{T \in \mathcal{T}_N^r} e_{2,T}(f)_p^p + \sum_{T \in \mathcal{T}_N^e} e_{2,T}(f)_p^p.$$

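We may use $\mathcal{O}(N)$ triangles in $\mathcal{T}_N^e$ and $\mathcal{T}_N^r$ (for example $N/2$ in each set). Since $f$ has discontinuities along $\Gamma$, the
approximation error on the edgy triangles does not tend to zero in $L^\infty$ and $\mathcal{T}_N^e$ should be chosen so that $\Omega_N^e$ has
the aspect of a thin layer around $\Gamma$. Since $\Gamma$ is a finite union of $C^2$ curves, we can build this layer of width $\mathcal{O}(N^{-2})$
and therefore of global area $|\Omega_N^e| \leq C N^{-2}$, by choosing long and thin triangles in $\mathcal{T}_N^e$. On the other hand, since
$f$ is uniformly $C^2$ on $\Omega_N^r$, we may choose all triangles in $\mathcal{T}_N^r$ of regular shape and diameter $h_T \leq C N^{-1/2}$. Hence
we obtain the following heuristic error estimate, for a well designed anisotropic triangulation:

$$e_{2,\mathcal{T}_N}(f)_p^p \leq C\, |\Omega_N^r| \Big( \sup_{T \in \mathcal{T}_N^r} h_T^{2p} \Big) \|d^2 f\|_{L^\infty(\Omega_N^r)}^p + C\, |\Omega_N^e|\, \|f\|_{L^\infty(\Omega_N^e)}^p \leq C N^{-p} + C N^{-2},$$

and therefore

$$e_{2,\mathcal{T}_N}(f)_p \leq C N^{-\min\{1,\, 2/p\}}, \qquad (5.88)$$

where the constant $C$ depends on $\|d^2 f\|_{L^\infty(\Omega \setminus \Gamma)}$, $\|f\|_{L^\infty(\Omega)}$, and on the number, length and maximal curvature of
the $C^2$ curves which constitute $\Gamma$.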

These heuristic estimates have been discussed in [38] and rigorously proved in [25]. Observe in particular
that the error is dominated by the edge contribution when $p > 2$ and by the smooth contribution when $p < 2$. For
the critical value $p = 2$ the two contributions have the same order.
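The balance between the two contributions can be tabulated directly. In this sketch the constants are set to 1 and the function names are hypothetical; it only reproduces the orders of magnitude $N^{-p}$ (regular triangles) and $N^{-2}$ (edgy triangles) used in the heuristic argument, for finite $p$.

```python
from math import isclose

def cartoon_error_bounds(N, p, c_smooth=1.0, c_edge=1.0):
    """Heuristic bounds on the p-th power of the error, with assumed unit constants.

    smooth: area O(1) times (h_T^2)^p with h_T ~ N^(-1/2), i.e. N^(-p)
    edge  : layer area ~ N^(-2) times a bounded L-infinity error
    """
    smooth = c_smooth * float(N) ** (-p)
    edge = c_edge * float(N) ** (-2)
    total = (smooth + edge) ** (1.0 / p)
    return smooth, edge, total

def rate_exponent(p):
    """Exponent r such that the global error behaves like N^(-r), see (5.88)."""
    return min(1.0, 2.0 / p)

for p in (1.0, 2.0, 4.0):
    smooth, edge, total = cartoon_error_bounds(100, p)
    dominant = "edge" if edge > smooth else "smooth" if smooth > edge else "balanced"
    print(p, rate_exponent(p), dominant, total)
```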

For $p \leq 2$, we obtain the approximation rate $N^{-1}$ which suggests that approximation results such as Theorem 5.2
should also apply to cartoon functions and that the quantity $A_p(f)$ should be finite for such functions. In
some sense, we want to "bridge the gap" between results of anisotropic piecewise polynomial approximation for
cartoon functions and for smooth functions. For this purpose, we first need to give a proper meaning to $A_p(f)$
when $f$ is a cartoon function. As already explained, this is not straightforward, due to the fact that the product
of two distributions has no meaning in general. Therefore, we cannot define $\det(d^2 f)$ in the distribution sense,
when the coefficients of $d^2 f$ are distributions without sufficient smoothness.

We describe a solution to this problem proposed in [22] which is based on a regularization process. In the
following, we consider a fixed radial nonnegative function $\varphi$ of unit integral and supported in the unit ball, and
define for all $\delta > 0$ and $f$ defined on $\Omega$,

$$\varphi_\delta(z) := \frac{1}{\delta^2}\, \varphi\!\left(\frac{z}{\delta}\right) \quad \text{and} \quad f_\delta = f * \varphi_\delta. \qquad (5.89)$$

It is then possible to give a meaning to $A_p(f)$ based on this regularization. This approach is additionally justified
by the fact that sharp curves of discontinuity are a mathematical idealisation. In real world applications, such as
photography, several physical limitations (depth of field, optical blurring) impose a certain level of blur on the
edges.

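If $f$ is a cartoon function on a set $\Omega$, and if $x \in \Gamma \setminus \mathcal{P}$, we denote by $[f](x)$ the jump of $f$ at this point. We also
denote by $|\kappa(x)|$ the absolute value of the curvature at $x$. For $p \in [1,\infty]$ and $\tau$ defined by $\frac{1}{\tau} := 1 + \frac{1}{p}$, we introduce
the two quantities

$$S_p(f) := \left\| \sqrt{|\det(d^2 f)|} \right\|_{L^\tau(\Omega \setminus \Gamma)} = A_p(f_{|\Omega \setminus \Gamma}), \qquad E_p(f) := \left\| \sqrt{|\kappa|}\, [f] \right\|_{L^\tau(\Gamma)},$$

which respectively measure the "smooth part" and the "edge part" of $f$. We also introduce the constant

$$C_{p,\varphi} := \left\| \sqrt{|\Phi \Phi'|} \right\|_{L^\tau(\mathbb{R})}, \qquad \Phi(x) := \int_{\mathbb{R}} \varphi(x,y)\, dy. \qquad (5.90)$$

Note that $f_\delta$ is only properly defined on the set

$$\Omega^\delta := \{ z \in \Omega \; ; \; B(z,\delta) \subset \Omega \},$$

and therefore, we define $A_p(f_\delta)$ as the $L^\tau$ norm of $\sqrt{|\det(d^2 f_\delta)|}$ on this set. The following result is proved in
[22].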

Theorem 5.4 For all cartoon functions $f$, the quantity $A_p(f_\delta)$ behaves as follows:

• If $p < 2$, then

$$\lim_{\delta \to 0} A_p(f_\delta) = S_p(f).$$

• If $p = 2$, then $\tau = \frac{2}{3}$ and

$$\lim_{\delta \to 0} A_2(f_\delta) = \left( S_2(f)^\tau + E_2(f)^\tau\, C_{2,\varphi}^\tau \right)^{1/\tau}.$$

• If $p > 2$, then $A_p(f_\delta) \to \infty$ according to

$$\lim_{\delta \to 0} \delta^{\frac{1}{2} - \frac{1}{p}} A_p(f_\delta) = E_p(f)\, C_{p,\varphi}.$$

Remark 5.5 This theorem reveals that as $\delta \to 0$, the contribution of the neighbourhood of $\Gamma$ to $A_p(f_\delta)$ is
negligible when $p < 2$ and dominant when $p > 2$, which was already remarked in the heuristic computation leading
to (5.88).

Remark 5.6 In the case $p = 2$, it is interesting to compare the limit expression $\left( S_2(f)^\tau + E_2(f)^\tau C_{2,\varphi}^\tau \right)^{1/\tau}$ with
the total variation $TV(f) = |f|_{BV}$. For a cartoon function, the total variation also can be split into a contribution
of the smooth part and a contribution of the edge, according to

$$TV(f) := \int_{\Omega \setminus \Gamma} |\nabla f| + \int_{\Gamma} |[f]|.$$

Functions of bounded variation are thus allowed to have jump discontinuities along edges of finite length. For this
reason, BV is frequently used as a natural smoothness space to describe the mathematical properties of images. It
is also well known that BV is a regularity space for certain hyperbolic conservation laws, in the sense that the total
variation of their solutions remains finite for all time $t > 0$. In recent years, it has been observed that the space
BV (and more generally classical smoothness spaces) does not provide a fully satisfactory description of piecewise
smooth functions arising in the above mentioned applications, in the sense that the total variation only takes
into account the size of the sets of discontinuities and not their geometric smoothness. In contrast, we observe
that the term $E_2(f)$ incorporates information on the smoothness of $\Gamma$ through the presence of the curvature
$|\kappa|$. The quantity $A_2(f)$ appears therefore as a potential substitute to $TV(f)$ in order to take into account the
geometric smoothness of the edges in cartoon functions and images.



6 Anisotropic greedy refinement algorithms 

In the two previous sections, we have established error estimates in $L^p$ norms for the approximation of a function
$f$ by piecewise polynomials on optimally adapted anisotropic partitions. Our analysis reveals that the optimal
partition needs to satisfy two intuitively desirable features:

1. Equidistribution of the local error.

2. Optimal shape adaptation of each element based on the local properties of $f$.

For instance, in the case of piecewise affine approximation on triangulations, these items mean that each triangle
$T$ should be close to equilateral with respect to a distorted metric induced by the local value of the hessian $d^2 f$.

From the computational viewpoint, a commonly used strategy for designing an optimal triangulation consists
therefore in evaluating the hessian $d^2 f$ and imposing that each triangle is isotropic with respect to a metric which
is properly related to its local value. We refer in particular to [10] and to [9] where this program is executed
by different approaches, both based on Delaunay mesh generation techniques (see also the software package
[45] which includes this type of mesh generator). While these algorithms produce anisotropic meshes which are
naturally adapted to the approximated function, they suffer from two intrinsic limitations:

1. They are based on the data of $d^2 f$, and therefore do not apply well to non-smooth or noisy functions.

2. They are non-hierarchical: for $N > M$, the triangulation $\mathcal{T}_N$ is not a refinement of $\mathcal{T}_M$.

Similar remarks apply to anisotropic mesh generation techniques in higher dimensions or for finite elements of
higher degree.

The need for hierarchical partitions is critical in the construction of wavelet bases, which play an important
role in applications to image and terrain data processing, in particular data compression [19]. In such applications,
the multilevel structure is also of key use for the fast encoding of the information. Hierarchy is also useful in the
design of optimally converging adaptive methods for PDE's [8, 40, 43]. However, all these developments are so
far mostly limited to isotropic refinement methods, in the spirit of the refinement procedures discussed in §3. Let
us mention that hierarchical and anisotropic triangulations have been investigated in [36], yet in this work the
triangulations are fixed in advance and therefore generally not adapted to the approximated function.

Figure 5: Anisotropic partitions obtained by rectangle split (left) and triangle bisection (right)

A natural objective is therefore to design adaptive algorithmic techniques that combine hierarchy and anisotropy,
that apply to any function $f \in L^p(\Omega)$, and that lead to optimally adapted partitions.

In this section, we discuss anisotropic refinement algorithms which fulfill this objective. These algorithms have
been introduced and studied in [20] for piecewise polynomial approximation on two-dimensional triangulations.
In the particular case of piecewise affine elements, it was proved in [21] that they lead to optimal error estimates.
The main idea is again to refine the element $T$ that maximizes the local error $e_{m,T}(f)_p$, but to allow several
scenarios of refinement for this element. Here are two typical instances in two dimensions:

1. For rectangular partitions, each rectangle may be split into two rectangles of equal size by either a
vertical or horizontal cut. There are therefore two splitting scenarios.

2. For triangular partitions, each triangle may be bisected from one of its vertices towards the mid-point of
the opposite edge. There are therefore three splitting scenarios.

We display on Figure 5 two examples of anisotropic partitions respectively obtained by such splitting techniques.
The choice between the different splitting scenarios is done by a decision rule which depends on the function $f$.
A typical decision rule is to select the split which best decreases the local error. The greedy refinement algorithm
therefore reads as follows:

1. Initialization: $\mathcal{T}_{N_0} = \mathcal{D}_0$ with $N_0 := \#(\mathcal{D}_0)$.

2. Given $\mathcal{T}_N$ select $T \in \mathcal{T}_N$ that maximizes $e_{m,T}(f)_p$.

3. Use the decision rule in order to select the type of split to be performed on $T$.

4. Split $T$ into $K$ elements to obtain $\mathcal{T}_{N+K-1}$ and return to step 2.

Intuitively, the error equidistribution is ensured by selecting the element that maximizes the local error, while the 
role of the decision rule is to optimize the shape of the generated elements. 
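The four steps above can be sketched generically with a priority queue. Everything in this block is an illustrative assumption rather than the authors' implementation: the function names, the representation of a rectangle as a tuple `(x0, x1, y0, y1)`, the restriction to finite $p$, and the test function, an affine function with slopes `QX = 8` and `QY = 1` whose local $L^2$ error on a rectangle is known in closed form.

```python
import heapq
from itertools import count
from math import sqrt

def greedy_refine(initial, local_error, split_options, n_max, p=2.0):
    """Greedy refinement: repeatedly split the element of largest local error.

    initial       : elements of the initial partition (step 1)
    local_error   : element -> local error on that element
    split_options : element -> list of candidate splits (each a list of children)
    n_max         : stop once the partition has at least n_max elements
    """
    tie = count()  # tie-breaker so that the heap never compares elements
    heap = [(-local_error(T), next(tie), T) for T in initial]
    heapq.heapify(heap)

    def error_after(children):
        # p-th power of the local error once T is replaced by its children
        return sum(local_error(c) ** p for c in children)

    while len(heap) < n_max:
        _, _, T = heapq.heappop(heap)                       # step 2
        children = min(split_options(T), key=error_after)   # step 3
        for c in children:                                  # step 4
            heapq.heappush(heap, (-local_error(c), next(tie), c))
    return [T for _, _, T in heap]

QX, QY = 8.0, 1.0   # slopes of the illustrative affine function

def affine_l2_error(T):
    """Exact L2 distance between the affine function and its average on T."""
    x0, x1, y0, y1 = T
    I, J = x1 - x0, y1 - y0
    return sqrt(I * J / 12.0 * (QX * QX * I * I + QY * QY * J * J))

def halvings(T):
    """The two splitting scenarios for a rectangle: halve J, or halve I."""
    x0, x1, y0, y1 = T
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    return [[(x0, x1, y0, ym), (x0, x1, ym, y1)],
            [(x0, xm, y0, y1), (xm, x1, y0, y1)]]

partition = greedy_refine([(0.0, 1.0, 0.0, 1.0)], affine_l2_error, halvings, 8)
print(sorted(partition))
```

With these slopes the variation in $x$ dominates, so the first seven splits all halve the $x$-interval and the partition consists of eight vertical strips of width $1/8$.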

The problem is now to understand if the piecewise polynomial approximations generated by such refinement
algorithms satisfy similar convergence properties as those which were established in §4 and §5 when using
optimally adapted partitions. We first study the anisotropic refinement algorithm for the simple case of piecewise
constants on rectangles, and we give a complete proof of its optimal convergence properties. We then present
the anisotropic refinement algorithm for piecewise polynomials on triangulations, and give without proof the
available results on its optimal convergence properties.

Remark 6.1 Let us remark that in contrast to the refinement algorithm discussed in §2.5 and §3.3, the partition
$\mathcal{T}_N$ may no longer be identified to a finite subtree within a fixed infinite master tree $\mathcal{M}$. Instead, for each $f$,
the decision rule defines an infinite master tree $\mathcal{M}(f)$ that depends on $f$. The refinement algorithm corresponds
to selecting a finite subtree within $\mathcal{M}(f)$. Due to the finite number of splitting possibilities for each element, this
finite subtree may again be encoded by a number of bits proportional to $N$. Similar to the isotropic refinement
algorithm, one may use more sophisticated techniques such as CART in order to select an optimal partition of
$N$ elements within $\mathcal{M}(f)$. On the other hand the selection of the optimal partition within all possible splitting
scenarios is generally of high combinatorial complexity.

Remark 6.2 A closely related algorithm was introduced in [26] and studied in [24]. In this algorithm every
element is a convex polygon which may be split into two convex polygons by an arbitrary line cut, allowing
therefore an infinite number of splitting scenarios. The selected split is again typically the one that decreases
most the local error. Although this approach gives access to more possibilities of anisotropic partitions, the
analysis of its convergence rate is still an open problem.



6.1 The refinement algorithm for piecewise constants on rectangles

As in §4, we work on the square domain $\Omega = [0,1]^2$ and we consider piecewise constant approximation on
anisotropic rectangles. At a given stage of the refinement algorithm, the rectangle $T = I \times J$ that maximizes
$e_{1,T}(f)_p$ is split either vertically or horizontally, which respectively corresponds to splitting one interval among $I$
and $J$ into two intervals of equal size and leaving the other interval unchanged. As already mentioned in the
case of the refinement algorithm discussed in §3.3, we may replace $e_{1,T}(f)_p$ by the more computable quantity
$\|f - P_{1,T} f\|_p$ for selecting the rectangle $T$ of largest local error. Note that the $L^2(T)$-projection onto constant
functions is simply the average of $f$ on $T$:

$$P_{1,T} f = \frac{1}{|T|} \int_T f.$$

If $T$ is the rectangle that is selected for being split, we denote by $(T_d, T_u)$ the down and up rectangles which are
obtained by a horizontal split of $T$ and by $(T_l, T_r)$ the left and right rectangles which are obtained by a vertical split
of $T$. The most natural decision rule for selecting the type of split to be performed on $T$ is based on comparing
the two quantities

$$e_{T,h}(f)_p := \left( e_{1,T_d}(f)_p^p + e_{1,T_u}(f)_p^p \right)^{1/p} \quad \text{and} \quad e_{T,v}(f)_p := \left( e_{1,T_l}(f)_p^p + e_{1,T_r}(f)_p^p \right)^{1/p},$$

which represent the local approximation error after splitting $T$ horizontally or vertically, with the standard
modification when $p = \infty$. The decision rule based on the $L^p$ error is therefore:

If $e_{T,h}(f)_p \leq e_{T,v}(f)_p$, then $T$ is split horizontally, otherwise $T$ is split vertically.
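For a general function the two quantities compared by the decision rule have to be estimated numerically. The sketch below does this with a midpoint rule on an $n \times n$ grid and uses the average of $f$ as the approximating constant, as suggested above; the function names, the grid size `n = 32` and the tuple layout `(x0, x1, y0, y1)` are assumptions made for illustration, and only finite $p$ is handled.

```python
def samples(f, rect, n):
    """Values of f at the midpoints of an n x n grid on rect = (x0, x1, y0, y1)."""
    x0, x1, y0, y1 = rect
    hx, hy = (x1 - x0) / n, (y1 - y0) / n
    return [f(x0 + (i + 0.5) * hx, y0 + (j + 0.5) * hy)
            for i in range(n) for j in range(n)]

def local_error(f, rect, p=2.0, n=32):
    """Midpoint-rule estimate of ||f - P f||_{L^p(T)}, P f being the average of f on T."""
    vals = samples(f, rect, n)
    mean = sum(vals) / len(vals)
    area = (rect[1] - rect[0]) * (rect[3] - rect[2])
    return (area * sum(abs(v - mean) ** p for v in vals) / len(vals)) ** (1.0 / p)

def split_errors(f, rect, p=2.0, n=32):
    """Return (e_h, e_v): errors after a horizontal split and after a vertical split."""
    x0, x1, y0, y1 = rect
    xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
    down, up = (x0, x1, y0, ym), (x0, x1, ym, y1)       # horizontal split halves J
    left, right = (x0, xm, y0, y1), (xm, x1, y0, y1)    # vertical split halves I
    e_h = (local_error(f, down, p, n) ** p + local_error(f, up, p, n) ** p) ** (1.0 / p)
    e_v = (local_error(f, left, p, n) ** p + local_error(f, right, p, n) ** p) ** (1.0 / p)
    return e_h, e_v

def choose_split(f, rect, p=2.0, n=32):
    """Decision rule: horizontal if e_h <= e_v, vertical otherwise."""
    e_h, e_v = split_errors(f, rect, p, n)
    return "horizontal" if e_h <= e_v else "vertical"

unit_square = (0.0, 1.0, 0.0, 1.0)
print(choose_split(lambda x, y: x, unit_square))   # f varies only in x
print(choose_split(lambda x, y: y, unit_square))   # f varies only in y
```

A function that varies only in $x$ is best served by halving $I$, and a function that varies only in $y$ by halving $J$, which is what the rule returns on these two examples.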

As already explained, the role of the decision rule is to optimize the shape of the generated elements. We have
seen in §4.1 that in the case where $f$ is an affine function

$$q(x,y) = q_0 + q_x x + q_y y,$$

the shape of a rectangle $T = I \times J$ which is optimally adapted to $q$ is given by the relation (4.59). This relation
cannot be exactly fulfilled by the rectangles generated by the refinement algorithm since they are by construction
of dyadic type, and in particular

$$\frac{|I|}{|J|} = 2^j,$$

for some $j \in \mathbb{Z}$. We can measure the adaptation of $T$ with respect to $q$ by the quantity

$$a_q(T) := \left| \log_2 \left( \frac{|I|\, |q_x|}{|J|\, |q_y|} \right) \right|, \qquad (6.91)$$

which is equal to 0 for optimally adapted rectangles and is small for "well adapted" rectangles. Inspection of the
arguments leading to the heuristic error estimate (4.65) in §4.1 or to the more rigorous estimate (4.68) in Theorem
4.2 reveals that these estimates also hold up to a fixed multiplicative constant if we use rectangles which have
well adapted shape in the sense that $a_{q_T}(T)$ is uniformly bounded where $q_T$ is the approximate value of $f$ on $T$.

We notice that for all $q$ such that $q_x q_y \neq 0$, there exists at least a dyadic rectangle $T$ such that $a_q(T) \leq \frac{1}{2}$.
We may therefore hope that the refinement algorithm leads to optimal error estimates of a similar form as (4.68),
provided that the decision rule tends to generate well adapted rectangles. The following result shows that this is
indeed the case when $f$ is exactly an affine function, and when using the decision rule either based on the $L^2$ or
$L^\infty$ error.

Proposition 6.3 Let $q \in \mathbb{P}_1$ be an affine function and let $T$ be a rectangle. If $T$ is split according to the decision
rule either based on the $L^2$ or $L^\infty$ error for this function and if $T'$ is a child of $T$ obtained from this splitting, one
then has

$$a_q(T') \leq |a_q(T) - 1|. \qquad (6.92)$$

As a consequence, all rectangles obtained after sufficiently many refinements satisfy $a_q(T) \leq 1$.
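Proof: We first observe that if $T = I \times J$, the local $L^\infty$ error is given by

$$e_{1,T}(q)_\infty := \frac{1}{2} \max\{ |q_x|\, |I|,\ |q_y|\, |J| \},$$

and the local $L^2$ error is given by

$$e_{1,T}(q)_2 := \frac{|T|^{1/2}}{2\sqrt{3}} \left( q_x^2 |I|^2 + q_y^2 |J|^2 \right)^{1/2}.$$

Assume that $T$ is such that $|I|\, |q_x| \geq |J|\, |q_y|$. In such a case, recalling that the horizontal split halves $J$ and
the vertical split halves $I$, we find that

$$e_{T,h}(q)_\infty = \frac{1}{2} \max\{ |q_x|\, |I|,\ |q_y|\, |J|/2 \} = |q_x|\, |I|/2,$$

and

$$e_{T,v}(q)_\infty = \frac{1}{2} \max\{ |q_x|\, |I|/2,\ |q_y|\, |J| \} \leq |q_x|\, |I|/2.$$

Therefore $e_{T,v}(q)_\infty \leq e_{T,h}(q)_\infty$, which shows that the vertical cut is selected by the decision rule based on the
$L^\infty$ error (in the case of equality one has $a_q(T) = 0$ and both cuts lead to $a_q(T') = 1$). We also find that

$$e_{T,h}(q)_2 = \frac{|T|^{1/2}}{2\sqrt{3}} \left( q_x^2 |I|^2 + q_y^2 |J|^2/4 \right)^{1/2},$$

and

$$e_{T,v}(q)_2 = \frac{|T|^{1/2}}{2\sqrt{3}} \left( q_x^2 |I|^2/4 + q_y^2 |J|^2 \right)^{1/2},$$

and therefore $e_{T,v}(q)_2 \leq e_{T,h}(q)_2$, which shows that the vertical cut is selected by the decision rule based on
the $L^2$ error, with the same remark in the case of equality. Using the fact that

$$a_q(T) = \log_2 \left( \frac{|I|\, |q_x|}{|J|\, |q_y|} \right),$$

we find that if $T'$ is any of the two rectangles generated by both decision rules, we have $a_q(T') = a_q(T) - 1$ if
$a_q(T) \geq 1$ and $a_q(T') = 1 - a_q(T)$ if $a_q(T) \leq 1$. In the case where $|I|\, |q_x| < |J|\, |q_y|$, we reach a similar conclusion
observing that the horizontal cut is selected by both decision rules. This proves (6.92). □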
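The contraction property (6.92) can be observed numerically for the $L^2$ decision rule. The sketch below is written in terms of the side lengths $|I|$ and $|J|$ only, which is all that matters for an affine function; the function names and the slopes $q_x = 5$, $q_y = 0.3$ are illustrative assumptions, and the closed-form $L^2$ error is the one obtained by integrating $(q - \bar q)^2$ over the rectangle.

```python
from math import log2, sqrt

def adaptation(qx, qy, I, J):
    """a_q(T) = |log2(|I||q_x| / (|J||q_y|))| for a rectangle with side lengths I, J."""
    return abs(log2((I * abs(qx)) / (J * abs(qy))))

def l2_error(qx, qy, I, J):
    """Exact L2 distance from q0 + qx*x + qy*y to its average on an I x J rectangle."""
    return sqrt(I * J / 12.0 * (qx * qx * I * I + qy * qy * J * J))

def split_once(qx, qy, I, J):
    """Apply the L2 decision rule once and return the side lengths of a child."""
    # Two congruent children: the combined L2 error is sqrt(2) times one child's error.
    e_halve_J = sqrt(2.0) * l2_error(qx, qy, I, J / 2)
    e_halve_I = sqrt(2.0) * l2_error(qx, qy, I / 2, J)
    return (I, J / 2) if e_halve_J <= e_halve_I else (I / 2, J)

qx, qy = 5.0, 0.3
I, J = 1.0, 1.0
history = [adaptation(qx, qy, I, J)]
for _ in range(12):
    I, J = split_once(qx, qy, I, J)
    history.append(adaptation(qx, qy, I, J))
print([round(a, 3) for a in history])
```

Starting from $a_q(T) \approx 4.06$, the measure decreases by one at each split until it falls below 1, after which it alternates between two values in $[0,1]$.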

Remark 6.4 We expect that the above result also holds for the decision rules based on the $L^p$ error for $p \notin \{2, \infty\}$
which therefore also lead to well adapted rectangles when $f$ is an affine function. In this sense all decision rules are
equivalent, and it is reasonable to use the simplest rules based on the $L^2$ or $L^\infty$ error in the refinement algorithm
that selects the rectangle which maximizes $e_{1,T}(f)_p$, even when $p$ differs from 2 or $\infty$.

6.2 Convergence of the algorithm 

From an intuitive point of view, we expect that when we apply the refinement algorithm to an arbitrary function
$f \in C^1(\Omega)$, the rectangles tend to adopt a locally well adapted shape, provided that the algorithm reaches a stage
where $f$ is sufficiently close to an affine function on each rectangle. However this may not necessarily happen
due to the fact that we are not ensured that the diameter of all the elements tends to 0 as $N \to \infty$. Note that this is
not ensured either for greedy refinement algorithms based on isotropic elements. However, we have used in the
proof of Theorem 3.10 the fact that for $N$ large enough, a fixed portion, say $N/2$, of the elements have arbitrarily
small diameter, which is no longer guaranteed in the anisotropic setting.

We can actually give a very simple example of a smooth function $f$ for which the approximation produced by
the anisotropic greedy refinement algorithm fails to converge towards $f$ due to this problem. Let $\varphi$ be a smooth
function of one variable which is compactly supported on $]0,1[$ and positive. We then define $f$ on $[0,1]^2$ by

$$f(x,y) := \varphi(4x) - \varphi(4x-1).$$

This function is supported in $[0,1/2] \times [0,1]$. Due to its particular structure, we find that if $T = [0,1]^2$, the best
approximation in $L^p(T)$ is achieved by the constant $c = 0$ and one has

$$e_{1,T}(f)_p = 2^{-1/p}\, \|\varphi\|_{L^p}.$$

We also find that $c = 0$ is the best approximation on the four subrectangles $T_d$, $T_u$, $T_l$ and $T_r$ and that $e_{T,h}(f)_p =
e_{T,v}(f)_p = e_{1,T}(f)_p$, which means that neither the horizontal nor the vertical split reduces the error. According to the
decision rule, the horizontal split is selected. We are then facing a similar situation on $T_d$ and $T_u$ which are again
both split horizontally. Therefore, after $N-1$ greedy refinement steps, the partition $\mathcal{T}_N$ consists of rectangles all
of the form $[0,1] \times J$ where $J$ are dyadic intervals, and the best approximation remains $c = 0$ on each of these
rectangles. This shows that the approximation produced by the algorithm fails to converge towards $f$, and the
global error remains

$$e_{1,\mathcal{T}_N}(f)_p = 2^{-1/p}\, \|\varphi\|_{L^p}$$

for all $N > 0$.

The above example illustrates the fact that the anisotropic greedy refinement algorithm may be defeated by
simple functions that exhibit an oscillatory behaviour. One way to correct this defect is to impose that the
refinement of $T = I \times J$ reduces its largest side-length in the case where the refinement suggested by the original decision
rule does not sufficiently reduce the local error. This means that we modify the decision rule as follows:

Case 1: if $\min\{ e_{T,h}(f)_p, e_{T,v}(f)_p \} \leq \rho\, e_{1,T}(f)_p$, then $T$ is split horizontally if $e_{T,h}(f)_p \leq e_{T,v}(f)_p$ or
vertically if $e_{T,h}(f)_p > e_{T,v}(f)_p$. We call this a greedy split.

Case 2: if $\min\{ e_{T,h}(f)_p, e_{T,v}(f)_p \} > \rho\, e_{1,T}(f)_p$, then $T$ is split horizontally if $|I| \leq |J|$ or vertically if $|I| > |J|$.
We call this a safety split.
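The two cases translate into a short function of the three local errors and the two side lengths. This is a sketch under stated assumptions: the function name, the returned labels and the default threshold `rho=0.5` are illustrative choices, the text only requiring the threshold to lie strictly between 0 and 1.

```python
def modified_split(e_T, e_h, e_v, I, J, rho=0.5):
    """Modified decision rule for a rectangle T = I x J with side lengths I and J.

    e_T      : local error on T before splitting
    e_h, e_v : local errors after a horizontal (halve J) or vertical (halve I) split
    Returns (direction, kind) with kind in {"greedy", "safety"}.
    """
    if min(e_h, e_v) <= rho * e_T:
        # Case 1: the best split reduces the error enough, keep the greedy choice.
        direction = "horizontal" if e_h <= e_v else "vertical"
        return direction, "greedy"
    # Case 2: neither split helps enough, halve the largest side instead.
    direction = "horizontal" if I <= J else "vertical"
    return direction, "safety"

# Situation of the oscillatory example: no split reduces the error.
print(modified_split(1.0, 1.0, 1.0, I=1.0, J=1.0))   # first split of the unit square
print(modified_split(1.0, 1.0, 1.0, I=1.0, J=0.5))   # its child [0,1] x J with |J| = 1/2
print(modified_split(1.0, 0.9, 0.4, I=1.0, J=1.0))   # a split that does reduce the error
```

On the second call the safety split halves $I$, which is precisely what the unmodified rule never did in the oscillatory example.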

Here $\rho$ is a parameter chosen in $]0,1[$. It should not be chosen too small in order to avoid that all splits are
of safety type which would then lead to isotropic partitions. Our next result shows that the approximation
produced by the modified algorithm does converge towards $f$.

Theorem 6.5 For any $f \in L^p(\Omega)$ or in $C(\Omega)$ in the case $p = \infty$, the partitions $\mathcal{T}_N$ produced by the modified
greedy refinement algorithm with parameter $\rho \in\, ]0,1[$ satisfy

$$\lim_{N \to +\infty} e_{1,\mathcal{T}_N}(f)_p = 0. \qquad (6.93)$$

Proof: Similar to the original refinement procedure, the modified one defines an infinite master tree
$\mathcal{M} := \mathcal{M}(f)$ with root $\Omega$ which contains all elements that can be generated at some stage of the
algorithm applied to $f$. This tree depends on $f$, and the partition $\mathcal{D}_N$ produced by the modified greedy
refinement algorithm may be identified to a finite subtree within $\mathcal{M}(f)$. We denote by
$\mathcal{Q}_j := \mathcal{Q}_j(f)$ the partition consisting of the rectangles of area $2^{-j}$ in $\mathcal{M}$, which are thus
obtained by $j$ refinements of $\Omega$. This partition also depends on $f$.

We first prove that $e_{1,\mathcal{Q}_j}(f)_p \to 0$ as $j \to \infty$. For this purpose we split $\mathcal{Q}_j$ into two sets
$\mathcal{Q}_j^g$ and $\mathcal{Q}_j^s$. The first set $\mathcal{Q}_j^g$ consists of the elements $T$ for which more than half of the
splits that led from $\Omega$ to $T$ were of greedy type. Due to the fact that such splits reduce the local
approximation error by a factor $\rho$ and that this error is not increased by a safety split, it is easily checked
by an induction argument that

$$e_{1,\mathcal{Q}_j^g}(f)_p = \Big( \sum_{T \in \mathcal{Q}_j^g} e_{1,T}(f)_p^p \Big)^{1/p} \le \rho^{j/2} e_{1,\Omega}(f)_p \le \rho^{j/2} \|f\|_{L^p},$$

which goes to $0$ as $j \to +\infty$. This result also holds when $p = \infty$. The second set $\mathcal{Q}_j^s$ consists of
the elements $T$ for which at least half of the splits that led from $\Omega$ to $T$ were safety splits. Since two
safety splits reduce the diameter of $T$ by at least a factor 2, we thus have

$$\max_{T \in \mathcal{Q}_j^s} h_T \le 2^{1 - j/4},$$

which goes to $0$ as $j \to +\infty$. From classical properties of density of piecewise constant functions in
$L^p$ spaces and in the space of continuous functions, it follows that

$$e_{1,\mathcal{Q}_j^s}(f)_p \to 0 \quad \text{as } j \to +\infty.$$

This proves that

$$e_{1,\mathcal{Q}_j}(f)_p = \big( e_{1,\mathcal{Q}_j^g}(f)_p^p + e_{1,\mathcal{Q}_j^s}(f)_p^p \big)^{1/p} \to 0 \quad \text{as } j \to +\infty,$$

with the standard modification if $p = \infty$.

In order to prove that $e_{1,\mathcal{D}_N}(f)_p$ also converges to $0$, we first observe that since
$e_{1,\mathcal{Q}_j}(f)_p \to 0$, it follows that for all $\varepsilon > 0$, there exists only a finite number of
$T \in \mathcal{M}(f)$ such that $e_{1,T}(f)_p \ge \varepsilon$. In turn, we find that

$$\varepsilon(N) := \max_{T \in \mathcal{D}_N} e_{1,T}(f)_p \to 0 \quad \text{as } N \to +\infty.$$

For some $j > 0$, we split $\mathcal{D}_N$ into two sets $\mathcal{D}_N^{+}$ and $\mathcal{D}_N^{-}$ which consist of those
$T \in \mathcal{D}_N$ which are in $\mathcal{Q}_l$ for $l \ge j$ and $l < j$ respectively. We thus have

$$e_{1,\mathcal{D}_N}(f)_p = \big( e_{1,\mathcal{D}_N^{+}}(f)_p^p + e_{1,\mathcal{D}_N^{-}}(f)_p^p \big)^{1/p} \le \big( e_{1,\mathcal{Q}_j}(f)_p^p + 2^j \varepsilon(N)^p \big)^{1/p}.$$

Since $e_{1,\mathcal{Q}_j}(f)_p \to 0$ as $j \to +\infty$ and $\varepsilon(N) \to 0$ as $N \to \infty$, and since $j$ is arbitrary, this
concludes the proof, with the standard modification if $p = \infty$. □

6.3 Optimal convergence

We now prove that using the specific value $\rho = \frac{1}{\sqrt{2}}$ the modified greedy refinement algorithm has optimal
convergence properties similar to (4.68) in the case where we measure the error in the $L^\infty$ norm. Similar
results can be obtained when the error is measured in $L^p$ with $p < \infty$, at the price of more technicalities.

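Theorem 6.6 There exists a constant $C > 0$ such that for any $f \in C^1(\Omega)$, the partition $\mathcal{D}_N$ produced by
the modified greedy refinement algorithm with parameter $\rho = \frac{1}{\sqrt{2}}$ satisfies the asymptotic convergence
estimate

$$\limsup_{N \to +\infty} N^{1/2} e_{1,\mathcal{D}_N}(f)_\infty \le C \, \Big\| \sqrt{|\partial_x f \, \partial_y f|} \Big\|_{L^2}. \qquad (6.94)$$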

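The proof of this theorem requires a preliminary result. Here and after, we use the $\ell^\infty$ norm on
$\mathbb{R}^2$ for measuring the gradient: for $z = (x, y) \in \Omega$

$$|\nabla f(z)| := \max\{ |\partial_x f(z)|, |\partial_y f(z)| \},$$

and

$$\|\nabla f\|_{L^\infty(T)} := \sup_{z \in T} |\nabla f(z)| = \max\{ \|\partial_x f\|_{L^\infty(T)}, \|\partial_y f\|_{L^\infty(T)} \}.$$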

We recall that the local $L^\infty$-error on $T$ is given by

$$e_{1,T}(f)_\infty = \frac{1}{2} \Big( \max_{z \in T} f(z) - \min_{z \in T} f(z) \Big).$$

For the sake of simplicity we define

$$e_T(f) := \max_{z \in T} f(z) - \min_{z \in T} f(z) = 2 e_{1,T}(f)_\infty,$$

and

$$e_{T,h}(f) := 2 e_{T,h}(f)_\infty, \quad e_{T,v}(f) := 2 e_{T,v}(f)_\infty.$$

We also recall from the proof of Theorem 6.5 that

$$\varepsilon(N) := \max_{T \in \mathcal{D}_N} e_T(f) \to 0 \quad \text{as } N \to +\infty.$$

Finally we sometimes use the notation $x(z)$ and $y(z)$ to denote the coordinates of a point $z \in \mathbb{R}^2$.
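As a small illustration of the identity between the local $L^\infty$-error and half of the oscillation, the following sketch works on a finite list of sampled values instead of a function on $T$ (this discretization, and the function names, are our assumptions). For finitely many values, the best constant in the sup norm is the midrange and its error is half of the oscillation.

```python
def sup_norm_constant_fit(samples):
    """Best constant in the sup norm for a finite list of values, and its error."""
    lo, hi = min(samples), max(samples)
    return (lo + hi) / 2.0, (hi - lo) / 2.0

def oscillation(samples):
    """The quantity max - min, i.e. twice the sup-norm error of the best constant."""
    return max(samples) - min(samples)
```

Working with the oscillation rather than with the error itself only removes a factor $\frac{1}{2}$, which is why the statements below are formulated in terms of the oscillation.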

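Lemma 6.7 Let $T_0 = I_0 \times J_0 \in \mathcal{D}_M$ be a dyadic rectangle obtained at some stage $M$ of the refinement
algorithm, and let $T = I \times J \in \mathcal{D}_N$ be a dyadic rectangle obtained at some later stage $N > M$ and such
that $T \subset T_0$. We then have

$$|I| \ge \min\Big\{ |I_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \Big\} \quad \text{and} \quad |J| \ge \min\Big\{ |J_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \Big\}.$$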

Proof: Since the coordinates $x$ and $y$ play symmetrical roles, it suffices to prove the first inequality. We
reason by contradiction. If the inequality does not hold, there exists a rectangle $T' = I' \times J'$ in the chain
that led from $T_0$ to $T$ which is such that

$$|I'| < \frac{\varepsilon(N)}{2 \|\nabla f\|_{L^\infty(T_0)}},$$

and such that $T'$ is split vertically by the algorithm. If this was a safety split, we would have that
$|J'| \le |I'|$ and therefore

$$e_{T'}(f) \le (|I'| + |J'|) \|\nabla f\|_{L^\infty(T')} \le 2 |I'| \, \|\nabla f\|_{L^\infty(T')} < \varepsilon(N),$$

which is a contradiction, since all ancestors of $T$ should satisfy $e_{T'}(f) \ge \varepsilon(N)$. Hence this split was
necessarily a greedy split.

Let $z_m := \mathrm{Argmin}_{z \in T'} f(z)$ and $z_M := \mathrm{Argmax}_{z \in T'} f(z)$, and let $T''$ be the child of $T'$ (after the
vertical split) containing $z_M$. Then $T''$ also contains a point $z'_m$ such that $|x(z'_m) - x(z_m)| \le |I'|/2$
and $y(z'_m) = y(z_m)$. It follows that

$$\begin{aligned}
e_{T',v}(f) &\ge f(z_M) - f(z'_m) \\
&\ge f(z_M) - f(z_m) - \|\partial_x f\|_{L^\infty(T')} |I'|/2 \\
&\ge e_{T'}(f) - \varepsilon(N)/4 \\
&\ge \tfrac{3}{4} e_{T'}(f) \\
&> \rho \, e_{T'}(f).
\end{aligned}$$

The error was therefore insufficiently reduced, which contradicts a greedy split. □



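Proof of Theorem 6.6 We consider a small but fixed $\delta > 0$, and we define $h(\delta)$ as the maximal $h > 0$ such
that

$$\forall z, z' \in \Omega, \quad |z - z'| \le 2 h(\delta) \Rightarrow |\nabla f(z) - \nabla f(z')| \le \delta.$$

For any rectangle $T = I \times J \subset \Omega$, we thus have

$$\begin{aligned}
e_T(f) &\ge (\|\partial_x f\|_{L^\infty(T)} - \delta) \min\{h(\delta), |I|\}, \\
e_T(f) &\ge (\|\partial_y f\|_{L^\infty(T)} - \delta) \min\{h(\delta), |J|\}.
\end{aligned} \qquad (6.95)$$

Let $M = M(f, \delta)$ be the smallest value of $N$ such that $\varepsilon(N) \le 9 \delta h(\delta)$. For all $N \ge M$, and
therefore $\varepsilon(N) \le 9 \delta h(\delta)$, we consider the partition $\mathcal{D}_N$ which is a refinement of $\mathcal{D}_M$. For any
rectangle $T_0 = I_0 \times J_0 \in \mathcal{D}_M$, we denote by $\mathcal{D}_N(T_0)$ the set of rectangles of $\mathcal{D}_N$ that are
contained in $T_0$. We thus have

$$\mathcal{D}_N := \bigcup_{T_0 \in \mathcal{D}_M} \mathcal{D}_N(T_0),$$

and $\mathcal{D}_N(T_0)$ is a partition of $T_0$. We shall next bound from below the side lengths of $T = I \times J$
contained in $\mathcal{D}_N(T_0)$, distinguishing different cases depending on the behaviour of $f$ on $T_0$.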



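Case 1. If $T_0 \in \mathcal{D}_M$ is such that $\|\nabla f\|_{L^\infty(T_0)} \le 10 \delta$, then a direct application of Lemma 6.7
shows that for all $T = I \times J \in \mathcal{D}_N(T_0)$ we have

$$|I| \ge \min\Big\{ |I_0|, \frac{\varepsilon(N)}{40 \delta} \Big\} \quad \text{and} \quad |J| \ge \min\Big\{ |J_0|, \frac{\varepsilon(N)}{40 \delta} \Big\}. \qquad (6.96)$$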

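Case 2. If $T_0 \in \mathcal{D}_M$ is such that $\|\partial_x f\|_{L^\infty(T_0)} \ge 10 \delta$ and $\|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta$, we
then claim that for all $T = I \times J \in \mathcal{D}_N(T_0)$ we have

$$|I| \ge \min\Big\{ |I_0|, \frac{\varepsilon(N)}{20 \|\partial_x f\|_{L^\infty(T_0)}} \Big\} \quad \text{and} \quad |J| \ge \min\Big\{ |J_0|, \frac{\varepsilon(N)}{20 \|\partial_y f\|_{L^\infty(T_0)}} \Big\}, \qquad (6.97)$$

and that furthermore

$$|T_0| \, \|\partial_x f\|_{L^\infty(T_0)} \|\partial_y f\|_{L^\infty(T_0)} \le \Big( \frac{10}{9} \Big)^2 \int_{T_0} |\partial_x f \, \partial_y f| \, dx \, dy. \qquad (6.98)$$

This last statement easily follows from the following observation: combining (6.95) with the fact that
$\|\partial_x f\|_{L^\infty(T_0)} \ge 10 \delta$ and $\|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta$ and that $e_{T_0}(f) \le \varepsilon(M) \le 9 \delta h(\delta)$, we find
that for all $z \in T_0$

$$|\partial_x f(z)| \ge \|\partial_x f\|_{L^\infty(T_0)} - \delta \ge \frac{9}{10} \|\partial_x f\|_{L^\infty(T_0)},$$

and

$$|\partial_y f(z)| \ge \|\partial_y f\|_{L^\infty(T_0)} - \delta \ge \frac{9}{10} \|\partial_y f\|_{L^\infty(T_0)}.$$

Integrating over $T_0$ yields (6.98). Moreover for any rectangle $T = I \times J \subset T_0$, we have

$$\frac{9}{10} \le \frac{e_T(f)}{\|\partial_x f\|_{L^\infty(T_0)} |I| + \|\partial_y f\|_{L^\infty(T_0)} |J|} \le 1. \qquad (6.99)$$

Clearly the two inequalities in (6.97) are symmetrical, and it suffices to prove the first one. Similar to the
proof of Lemma 6.7 we reason by contradiction, assuming that a rectangle $T' = I' \times J'$ with
$|I'| \, \|\partial_x f\|_{L^\infty(T_0)} < \frac{\varepsilon(N)}{10}$ was split vertically by the algorithm in the chain leading from $T_0$ to
$T$. A simple computation using inequality (6.99) shows that

$$\frac{e_{T',h}(f)}{e_{T'}(f)} \le \frac{e_{T',h}(f)}{e_{T',v}(f)} \le \frac{5}{9} \times \frac{1 + 2\sigma}{1 + \sigma/2} \quad \text{with} \quad \sigma := \frac{\|\partial_x f\|_{L^\infty(T_0)} |I'|}{\|\partial_y f\|_{L^\infty(T_0)} |J'|}.$$

In particular if $\sigma < 0.2$ the algorithm performs a horizontal greedy split on $T'$, which contradicts our
assumption. Hence $\sigma \ge 0.2$, but this also leads to a contradiction since

$$\varepsilon(N) \le e_{T'}(f) \le \|\partial_x f\|_{L^\infty(T_0)} |I'| + \|\partial_y f\|_{L^\infty(T_0)} |J'| \le (1 + \sigma^{-1}) \frac{\varepsilon(N)}{10} < \varepsilon(N).$$

Case 3. If $T_0 \in \mathcal{D}_M$ is such that $\|\partial_x f\|_{L^\infty(T_0)} \le 10 \delta \le \|\partial_y f\|_{L^\infty(T_0)}$, we then claim that for
all $T = I \times J \in \mathcal{D}_N(T_0)$ we have

$$|I| \ge \min\Big\{ |I_0|, \frac{\varepsilon(N)}{C \delta} \Big\} \quad \text{and} \quad |J| \ge \min\Big\{ |J_0|, \frac{\varepsilon(N)}{4 \|\nabla f\|_{L^\infty(T_0)}} \Big\}, \quad \text{with } C = 200, \qquad (6.100)$$

with symmetrical result if $T_0$ is such that $\|\partial_y f\|_{L^\infty(T_0)} \le 10 \delta \le \|\partial_x f\|_{L^\infty(T_0)}$. The second part of
(6.100) is a direct consequence of Lemma 6.7, hence we focus on the first part. Applying the second inequality
of (6.95) to $T = T_0$, we obtain

$$9 \delta h(\delta) \ge e_{T_0}(f) \ge (\|\partial_y f\|_{L^\infty(T_0)} - \delta) \min\{h(\delta), |J_0|\} \ge 9 \delta \min\{h(\delta), |J_0|\},$$

from which we infer that $|J_0| \le h(\delta)$. If $z_1, z_2 \in T_0$ and $x(z_1) = x(z_2)$ we therefore have
$|\partial_y f(z_1)| \ge |\partial_y f(z_2)| - \delta$. It follows that for any rectangle $T = I \times J \subset T_0$ we have

$$(\|\partial_y f\|_{L^\infty(T)} - \delta) |J| \le e_T(f) \le \|\partial_y f\|_{L^\infty(T)} |J| + 10 \delta |I|. \qquad (6.101)$$

We then again reason by contradiction, assuming that a rectangle $T' = I' \times J'$ with $|I'| < \frac{2 \varepsilon(N)}{C \delta}$ was
split vertically by the algorithm in the chain leading from $T_0$ to $T$. If $\|\partial_y f\|_{L^\infty(T')} \le 10 \delta$, then
$\|\nabla f\|_{L^\infty(T')} \le 10 \delta$ and Lemma 6.7 shows that $T'$ should not have been split vertically, which is a
contradiction. Otherwise $\|\partial_y f\|_{L^\infty(T')} - \delta \ge \frac{9}{10} \|\partial_y f\|_{L^\infty(T')}$, and we obtain

$$(1 - 20/C) \, e_{T'}(f) \le \|\partial_y f\|_{L^\infty(T')} |J'| \le \frac{10}{9} e_{T'}(f). \qquad (6.102)$$

We now consider the children $T'_h$ and $T'_v$ of $T'$ of maximal error after a horizontal and vertical split
respectively, and we inject (6.102) in (6.101). It follows that

$$\begin{aligned}
e_{T',h}(f) = e_{T'_h}(f) &\le \|\partial_y f\|_{L^\infty(T')} |J'|/2 + 10 \delta |I'| \\
&\le \tfrac{5}{9} e_{T'}(f) + 20 \varepsilon(N)/C \\
&\le (\tfrac{5}{9} + 20/C) \, e_{T'}(f) = \tfrac{59}{90} e_{T'}(f),
\end{aligned}$$

and

$$\begin{aligned}
e_{T',v}(f) = e_{T'_v}(f) &\ge (\|\partial_y f\|_{L^\infty(T')} - \delta) |J'| \\
&\ge \tfrac{9}{10} \|\partial_y f\|_{L^\infty(T')} |J'| \\
&\ge \tfrac{9}{10} (1 - 20/C) \, e_{T'}(f) = \tfrac{81}{100} e_{T'}(f).
\end{aligned}$$

Therefore $e_{T',v}(f) > e_{T',h}(f)$ which is a contradiction, since our decision rule would then select a
horizontal split.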

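We now choose $N$ large enough so that the minimum in (6.96), (6.97) and (6.100) is always equal to the
second term. For all $T \in \mathcal{D}_N(T_0)$, we respectively find that

$$\frac{1}{|T|} \le \frac{C}{\varepsilon(N)^2} \begin{cases}
\delta^2 & \text{if } \|\nabla f\|_{L^\infty(T_0)} \le 10 \delta, \\
\frac{1}{|T_0|} \int_{T_0} |\partial_x f \, \partial_y f| & \text{if } \|\partial_x f\|_{L^\infty(T_0)} \ge 10 \delta \text{ and } \|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta, \\
\delta \|\nabla f\|_{L^\infty} & \text{if } \|\partial_x f\|_{L^\infty(T_0)} \le 10 \delta \text{ and } \|\partial_y f\|_{L^\infty(T_0)} \ge 10 \delta \text{ (or reversed)},
\end{cases}$$

with $C = \max\{40^2, 20^2 (10/9)^2, 800\} = 1600$. For $z \in \Omega$, we set $\psi(z) := \frac{1}{|T|}$ where $T \in \mathcal{D}_N$ is such
that $z \in T$, and obtain

$$N = \#(\mathcal{D}_N) = \int_\Omega \psi \le C \varepsilon(N)^{-2} \Big( \int_\Omega |\partial_x f \, \partial_y f| \, dx \, dy + \delta \|\nabla f\|_{L^\infty} + \delta^2 \Big).$$

Taking the limit as $\delta \to 0$, we obtain

$$\limsup_{N \to \infty} N^{1/2} \|f - f_N\|_{L^\infty} \le 20 \, \Big\| \sqrt{|\partial_x f \, \partial_y f|} \Big\|_{L^2},$$

which concludes the proof. □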
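The numerical constants used in the proof can be checked in exact arithmetic. The sketch below is only a sanity check under our reading of the argument: it assumes the threshold $\rho = 1/\sqrt{2}$, the bound $\frac{5}{9} \times \frac{1 + 2\sigma}{1 + \sigma/2}$ of Case 2 with critical value $\sigma = 0.2$, and the final constant $C = \max\{40^2, 20^2 (10/9)^2, 800\}$. The variable names are ours.

```python
from fractions import Fraction
import math

def ratio_bound(sigma):
    """The bound (5/9) * (1 + 2*sigma) / (1 + sigma/2), in exact arithmetic."""
    sigma = Fraction(sigma)
    return Fraction(5, 9) * (1 + 2 * sigma) / (1 + sigma / 2)

rho = 1 / math.sqrt(2)                    # assumed value of the parameter rho
sigma = Fraction(1, 5)                    # critical value sigma = 0.2
bound_at_critical = ratio_bound(sigma)    # equals 70/99
greedy_is_forced = float(bound_at_critical) < rho

# Factor (1 + 1/sigma)/10 of the second contradiction at sigma = 0.2.
second_factor = (1 + 1 / sigma) / 10

# Final constant and the resulting factor sqrt(C)/2 in the error estimate.
C = max(Fraction(40) ** 2, Fraction(20) ** 2 * Fraction(10, 9) ** 2, Fraction(800))
final_factor = math.sqrt(C) / 2
```

The bound is increasing in $\sigma$, so its value at $\sigma = 0.2$ controls the whole range $\sigma < 0.2$. The margin with respect to $1/\sqrt{2}$ is very small, which suggests that the constants of the proof were tuned to this specific value of $\rho$.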
Remark 6.8 The proof of the Theorem can be adapted to any choice of parameter $\rho \in ]\frac{1}{2}, 1[$.

6.4 Refinement algorithms for piecewise polynomials on triangles

As in §5, we work on a polygonal domain $\Omega \subset \mathbb{R}^2$ and we consider piecewise polynomial approximation on
anisotropic triangles. At a given stage of the refinement algorithm, the triangle $T$ that maximizes $e_{m,T}(f)_p$
is split from one of its vertices $a_i \in \{a_1, a_2, a_3\}$ towards the mid-point $b_i$ of the opposite edge $e_i$. Here
again, we may replace $e_{m,T}(f)_p$ by the more computable quantity $\|f - P_{m,T} f\|_p$ for selecting the triangle $T$
of largest local error.

If $T$ is the triangle that is selected for being split, we denote by $(T_i', T_i'')$ the two children which are
obtained when $T$ is split from $a_i$ towards $b_i$. The most natural decision rule is based on comparing the three
quantities

$$e_{T,i}(f)_p := \big( e_{m,T_i'}(f)_p^p + e_{m,T_i''}(f)_p^p \big)^{1/p}, \quad i = 1, 2, 3,$$

which represent the local approximation error on $T$ after the three splitting options, with the standard
modification when $p = \infty$. The decision rule based on the $L^p$ error is therefore:

$T$ is split from $a_i$ towards $b_i$ for an $i$ that minimizes $e_{T,i}(f)_p$.
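The splitting options and the greedy choice can be sketched as follows. This is an illustrative sketch and not the implementation used for the numerical tests: vertices are indexed from 0 instead of 1, the local error is passed as a user-supplied function `local_error`, and `edge_surrogate` is a hypothetical stand-in for it (the largest value of $q(x,y) = x^2 + 100 y^2$ over the edge vectors, divided by 4), chosen only to obtain a runnable example.

```python
def bisect(tri, i):
    """Split tri = (a0, a1, a2) from vertex i towards the midpoint of the opposite edge."""
    a, b, c = tri[i], tri[(i + 1) % 3], tri[(i + 2) % 3]
    mid = ((b[0] + c[0]) / 2.0, (b[1] + c[1]) / 2.0)
    return (a, b, mid), (a, mid, c)

def greedy_choice(tri, local_error, p=2.0):
    """Index i minimizing (local_error(T_i')^p + local_error(T_i'')^p)^(1/p)."""
    def split_error(i):
        t1, t2 = bisect(tri, i)
        return (local_error(t1) ** p + local_error(t2) ** p) ** (1.0 / p)
    return min(range(3), key=split_error)

def edge_surrogate(tri):
    """Hypothetical local error: max over the edges of q(edge vector) / 4."""
    def q(u):
        return u[0] ** 2 + 100.0 * u[1] ** 2
    edges = [(tri[(k + 1) % 3][0] - tri[k][0], tri[(k + 1) % 3][1] - tri[k][1])
             for k in range(3)]
    return max(q(e) for e in edges) / 4.0
```

A call such as `greedy_choice(((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)), edge_surrogate)` returns the index of the vertex from which the triangle would be split. Any other local error functional, for instance one based on a projection or an interpolation error, can be plugged in through `local_error`.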

A convergence analysis of this anisotropic greedy algorithm is proposed in [21] in the case of piecewise affine
functions corresponding to $m = 2$. Since it is by far more involved than the convergence analysis presented in
§6.1, §6.2 and §6.3 for piecewise constants on rectangles, but possesses several similar features, we discuss
without proofs the main available results and we also illustrate their significance through numerical tests.

No convergence analysis is so far available for the case of higher order piecewise polynomials $m > 2$, besides a
general convergence theorem similar to Theorem 6.5. The algorithm can be generalized to simplices in dimension
$d > 2$. For instance, a 3-d simplex can be split into two simplices by a plane connecting one of its edges to the
midpoint of the opposite edge, which leaves a choice between 6 possibilities.
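The count of 6 possibilities for a 3-d simplex can be made explicit by enumerating the edges. In the sketch below (function names are ours), each option uses the plane containing one edge $(p, q)$ and the midpoint $m$ of the opposite edge $(r, s)$, which produces the two children $(p, q, r, m)$ and $(p, q, s, m)$.

```python
from itertools import combinations

def tetra_volume(a, b, c, d):
    """Volume of the tetrahedron with vertices a, b, c, d (3-tuples)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0

def bisections(vertices):
    """List the bisections of a tetrahedron, one per choice of edge (p, q)."""
    options = []
    for i, j in combinations(range(4), 2):
        k, l = [n for n in range(4) if n not in (i, j)]
        p, q, r, s = (vertices[n] for n in (i, j, k, l))
        m = tuple((r[n] + s[n]) / 2.0 for n in range(3))  # midpoint of opposite edge
        options.append(((p, q, r, m), (p, q, s, m)))
    return options
```

Since the volume is linear in each vertex, both children of every option have exactly half of the volume of the parent, as in the two-dimensional case.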

As remarked at the end of §6.1, we may use a decision rule based on a local error measured in another
norm than the $L^p$ norm for which we select the element $T$ of largest local error. In [21], we considered the
"$L^2$-projection" decision rule based on minimizing the quantity

$$d_{T,i}(f)_2 := \big( \|f - P_{2,T_i'}(f)\|_{L^2(T_i')}^2 + \|f - P_{2,T_i''}(f)\|_{L^2(T_i'')}^2 \big)^{1/2},$$

as well as the "$L^\infty$-interpolation" decision rule based on minimizing the quantity

$$d_{T,i}(f)_\infty := \|f - I_{2,T_i'}(f)\|_{L^\infty(T_i')} + \|f - I_{2,T_i''}(f)\|_{L^\infty(T_i'')},$$

where $I_{2,T}$ denotes the local interpolation operator: $I_{2,T}(f)$ is the affine function that is equal to $f$ at the
vertices of $T$. Using either of these two decision rules, it is possible to prove that the generated triangles
tend to adopt a well adapted shape.

In a similar way to the algorithm for piecewise constant approximation on rectangles, we first discuss the
behaviour of the algorithm when $f$ is exactly a quadratic function $q$. Denoting by $\mathbf{q}$ its homogeneous part of
degree 2, we have seen in §5.1 that when $\det(\mathbf{q}) \neq 0$, the approximation error on an optimally adapted
triangle $T$ is given by

$$e_{2,T}(q)_p = e_{2,T}(\mathbf{q})_p = |T|^{1/\tau} K_{2,p}(\mathbf{q}), \quad \frac{1}{\tau} := \frac{1}{p} + 1.$$

We can measure the adaptation of $T$ with respect to $\mathbf{q}$ by the quantity

$$\sigma_{\mathbf{q}}(T)_p := \frac{e_{2,T}(\mathbf{q})_p}{|T|^{1/\tau} K_{2,p}(\mathbf{q})},$$

which is equal to 1 for optimally adapted triangles and small for "well adapted" triangles. It is easy to check
that the functions $(\mathbf{q}, T) \mapsto \sigma_{\mathbf{q}}(T)_p$ are equivalent for all $p$, similar to the shape functions $K_{2,p}$ as
observed in §5.2.

The following theorem, which is a direct consequence of the results in [21], shows that the decision rule tends
to make "most triangles" well adapted to $\mathbf{q}$.

Theorem 6.9 There exist constants $0 < \theta, \mu < 1$ and a constant $C_p$ that only depends on $p$ such that the
following holds. For any $\mathbf{q} \in \mathbb{H}_2$ such that $\det(\mathbf{q}) \neq 0$ and any triangle $T$, after $j$ refinement levels of
$T$ according to the decision rule, a proportion $1 - \theta^j$ of the $2^j$ generated triangles $T'$ satisfies

$$\sigma_{\mathbf{q}}(T')_p \le \max\{ \mu^j \sigma_{\mathbf{q}}(T)_p, C_p \}. \qquad (6.103)$$

As a consequence, for $j > j(\mathbf{q}, T) = \log_\mu \frac{C_p}{\sigma_{\mathbf{q}}(T)_p}$,

$$\sigma_{\mathbf{q}}(T')_p \le C_p, \qquad (6.104)$$

for a proportion $1 - \theta^j$ of the $2^j$ generated triangles $T'$.

This result should be compared to Proposition 6.3 in the case of rectangles. Here it is not possible to show
that all triangles become well adapted to $\mathbf{q}$, but a proportion that tends to 1 does. It is quite remarkable that
with only three splitting options, the greedy algorithm manages to drive most of the triangles to a near optimal
shape. We illustrate this fact on Figure 6 in the case of the quadratic form $\mathbf{q}(x, y) := x^2 + 100 y^2$, and an
initial triangle $T$ which is equilateral for the euclidean metric and therefore not well adapted to $\mathbf{q}$. Triangles
such that $\sigma_{\mathbf{q}}(T')_2 \le C_2$ are displayed in white, others in grey. We observe the growth of the proportion of
well adapted triangles as the refinement level increases.

Figure 6: Greedy refinement for $\mathbf{q}(x, y) := x^2 + 100 y^2$: $j = 2$ (left), $j = 5$ (center), $j = 8$ (right).



From an intuitive point of view, we expect that when we apply the anisotropic greedy refinement algorithm
to an arbitrary function $f \in C^2(\Omega)$, the triangles tend to adopt a locally well adapted shape, provided that the
algorithm reaches a stage where $f$ is sufficiently close to a quadratic function on each triangle. As in the case
of the greedy refinement algorithm for rectangles, this may not always be the case. It is however possible to
prove that this property holds in the case of strictly convex or concave functions, using the
"$L^\infty$-interpolation" decision rule. This makes it possible to prove in such a case that the approximation
produced by the anisotropic greedy algorithm satisfies an optimal convergence estimate in accordance with
Theorem 5.2. These results from [21] can be summarized as follows.

Theorem 6.10 If $f$ is a $C^2$ function such that $d^2 f(x) \ge \alpha I$ or $d^2 f(x) \le -\alpha I$, for all $x \in \Omega$ and some
$\alpha > 0$, then the triangulation generated by the anisotropic greedy refinement algorithm (with the
$L^\infty$-interpolation decision rule) satisfies

$$\lim_{N \to +\infty} \max_{T \in \mathcal{T}_N} h_T = 0. \qquad (6.105)$$

Moreover, there exists a constant $C > 0$ such that for any such $f$, the approximation produced by the anisotropic
greedy refinement algorithm satisfies the asymptotic convergence estimate

$$\limsup_{N \to +\infty} N e_{2,\mathcal{T}_N}(f)_p \le C \, \Big\| \sqrt{|\det(d^2 f)|} \Big\|_{L^\tau}, \quad \frac{1}{\tau} := \frac{1}{p} + 1. \qquad (6.106)$$


For a non-convex function, we are not ensured that the diameter of the elements tends to $0$ as $N \to \infty$, and
similar to the greedy algorithm for rectangles, it is possible to produce examples of smooth functions $f$ for
which the approximation produced by the anisotropic greedy refinement algorithm fails to converge towards $f$. A
natural way to modify the algorithm in order to circumvent this problem is to impose a type of splitting that
tends to diminish the diameter, such as longest edge or newest vertex bisection, in the case where the
refinement suggested by the original decision rule does not sufficiently reduce the local error. This means that
we modify the decision rule as follows:

Case 1: if $\min\{e_{T,1}(f)_p, e_{T,2}(f)_p, e_{T,3}(f)_p\} \le \rho\, e_{2,T}(f)_p$, then split $T$ from $a_i$ towards $b_i$ for an
$i$ that minimizes $e_{T,i}(f)_p$. We call this a greedy split.

Case 2: if $\min\{e_{T,1}(f)_p, e_{T,2}(f)_p, e_{T,3}(f)_p\} > \rho\, e_{2,T}(f)_p$, then split $T$ from the most recently
generated vertex or towards its longest edge in the euclidean metric. We call this a safety split.
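The modified decision rule for triangles can be sketched as follows, using longest edge bisection for the safety split. The sketch assumes that the local error `e_T` and the three post-split errors `split_errors` have already been computed by some other routine, that vertices are indexed from 0, and that `rho` lies in $]0, 1[$; the function names are ours.

```python
import math

def longest_edge_vertex(tri):
    """Index of the vertex opposite to the longest edge (euclidean metric)."""
    def opposite_edge_length(i):
        b, c = tri[(i + 1) % 3], tri[(i + 2) % 3]
        return math.hypot(c[0] - b[0], c[1] - b[1])
    return max(range(3), key=opposite_edge_length)

def modified_choice(e_T, split_errors, tri, rho):
    """Return (vertex index, kind of split) for the triangle tri.

    split_errors[i] is the local error after splitting from vertex i.
    """
    best = min(range(3), key=lambda i: split_errors[i])
    if split_errors[best] <= rho * e_T:
        return best, "greedy"                    # Case 1
    return longest_edge_vertex(tri), "safety"    # Case 2
```

Newest vertex bisection could be used instead of longest edge bisection in the second branch, at the price of storing for each triangle which vertex was generated last.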

As in the modified greedy algorithm for rectangles, $\rho$ is a parameter chosen in $]0, 1[$ that should not be
chosen too small, since otherwise all splits would be of safety type, which would lead to isotropic
triangulations. It was proved in [20] that the approximation produced by this modified algorithm does converge
towards $f$ for any $f \in L^p(\Omega)$. The following result also holds for the generalization of this algorithm to
higher degree piecewise polynomials.

Theorem 6.11 For any $f \in L^p(\Omega)$, or in $C(\Omega)$ in the case $p = \infty$, the approximations produced by the modified
anisotropic greedy refinement algorithm with parameter $\rho \in ]0, 1[$ satisfy

$$\limsup_{N \to +\infty} e_{2,\mathcal{T}_N}(f)_p = 0. \qquad (6.107)$$

Similar to Theorem 6.6, we may expect that the modified anisotropic greedy refinement algorithm satisfies
optimal convergence estimates for all functions, but this is an open question at the present stage.

Conjecture. There exist a constant $C > 0$ and $\rho^* \in ]0, 1[$ such that for any $f \in C^2$, the approximation
produced by the modified anisotropic greedy refinement algorithm with parameter $\rho \in ]\rho^*, 1[$ satisfies the
asymptotic convergence estimate (6.106).

We illustrate the performance of the anisotropic greedy refinement algorithm for a function $f$ which
has a sharp transition along a curved edge. Specifically we consider

$$f(x, y) = f_\delta(x, y) := g_\delta\big(\sqrt{x^2 + y^2}\big),$$

where $g_\delta$ is defined by $g_\delta(r) = \frac{5 - r^2}{4}$ for $0 \le r \le 1$, $g_\delta(1 + \delta + r) = -\frac{5 - (1 - r)^2}{4}$ for $r \ge 0$, $g_\delta$ is a polynomial

of degree 5 on $[1, 1 + \delta]$ which is determined by imposing that $g_\delta$ is globally $C^2$. The parameter $\delta$
therefore measures the sharpness of the transition. We apply the anisotropic refinement algorithm based on
splitting the triangle that maximizes the local $L^2$-error and we therefore measure the global error in $L^2$.

Figure 7 displays the triangulation $\mathcal{T}_{10000}$ obtained after 10000 steps of the algorithm for $\delta = 0.2$. In
particular, triangles $T$ such that $\sigma_{\mathbf{q}}(T)_2 \le C_2$ - where $\mathbf{q}$ is the quadratic form associated with $d^2 f$
measured at the barycenter of $T$ - are displayed in white, others in grey. As expected, most triangles are of the
first type and therefore well adapted to $f$. We also display on this figure the adaptive isotropic triangulation
produced by the greedy tree algorithm based on newest vertex bisection for the same number of triangles.




Since $f$ is a $C^2$ function, approximations by uniform, adaptive isotropic and adaptive anisotropic
triangulations all yield the convergence rate $\mathcal{O}(N^{-1})$. However the constant

$$C := \limsup_{N \to +\infty} N e_{2,\mathcal{T}_N}(f)_2,$$

strongly differs depending on the algorithm and on the sharpness of the transition. We denote by $C_U$, $C_I$ and
$C_A$ the empirical constants (estimated by $N \|f - f_N\|_2$ for $N = 8192$) in the uniform, adaptive isotropic and
adaptive anisotropic case respectively, and by $U(f) := \|d^2 f\|_{L^2}$, $I(f) := \|d^2 f\|_{L^{2/3}}$ and
$A(f) := \|\sqrt{|\det(d^2 f)|}\|_{L^{2/3}}$ the theoretical constants suggested by the convergence estimates. We observe
on Figure 8 that $C_U$ and $C_I$ grow in a similar way as $U(f)$ and $I(f)$ as $\delta \to 0$ (a detailed computation shows
that $U(f) \approx 10.37 \delta^{-3/2}$ and $I(f) \approx 14.01 \delta^{-1/2}$). In contrast $C_A$ and $A(f)$ remain uniformly bounded, a
fact which is in accordance with Theorem 5.4 and reflects the superiority of anisotropic triangulations as the
layer becomes thinner and $f_\delta$ tends to a cartoon function.

Figure 8: Comparison between theoretical and empirical convergence constants for uniform, adaptive isotropic
and anisotropic refinements, and for different values of $\delta$.

We finally apply the anisotropic refinement algorithm to the numerical image of Figure 4 based on the
discretized $L^2$ error and using $N = 2000$ triangles. We observe on Figure 9 that the ringing artefacts produced
by the isotropic greedy refinement algorithm near the edges are strongly reduced. This is due to the fact that the
anisotropic greedy refinement algorithm generates long and thin triangles aligned with the edges. We also observe
that the quality is slightly improved when using the modified algorithm. Let us mention that a different approach
to the approximation of images by adaptive anisotropic triangulations was proposed in [27]. This approach is
based on a thinning algorithm, which starts from a fine triangulation and iteratively coarsens it by point removal.
The use of adaptive anisotropic partitions also has strong similarities with thresholding methods based
on representations which have more directional selectivity than wavelet decompositions I4l ll3|[3ni37l . It is not 
known so far if these methods satisfy asymptotic error estimates of the same form as (6.106).




Figure 9: Approximation by 2000 anisotropic triangles obtained by the greedy (left) and modified (right) algorithm.